Brief Reports 


The Journal of Consulting Psychology will 
accept Brief Reports of research studies in 
clinical psychology for early publication with- 
out expense to the author. The procedure is 
intended to permit the publication of soundly 
designed studies of specialized interest or lim- 
ited importance which cannot now be ac- 
cepted because of lack of space. Several pages 
in each issue will be devoted to Brief Reports, 
published in the order of their receipt with- 
out respect to the dates of receipt of the regu- 
lar articles. Most Brief Reports appear in the 
first or second issue to go to press following 
their final acceptance. 


An author who wishes to submit a Brief 
Report: 


1. Sends the Brief Report, limited to one printed 
page and prepared according to the specifications 
given below. 

2. Also sends to the Editor a full report of the re- 
search study, in sufficient detail to give a clear ac- 
count of its background, procedure, results, and con- 
clusions, which will be filed with the American 
Documentation Institute to insure indefinite avail- 
ability. 

3. Prepares at least 100 mimeographed copies of 
the full report, which the author will send without 
charge to all who request it as long as the supply 
lasts. 


4. Agrees not to submit the full report to another 
journal of general circulation. 


Specifications 


Brief Report. The Brief Report should give 
a clear, condensed summary of the procedure 
of the study and as full an account of the re- 
sults as space permits. 


To insure that the Brief Report will be no 
longer than one printed page, its typescript, 
including all matter except the title and the 
author’s lines, must not exceed 70 lines av- 
eraging 42 characters and spaces in length. 
Set the typewriter margins for short lines of 
42 characters, which are 3.5 inches long in 
elite typing, and 4.2 inches long in pica. 

The manuscript of the Brief Report must 
be double spaced throughout. Except for its 
short lines, it follows the standard style (1). 
Headings, tables, and references are avoided 
or, if essential, must be counted in the 70 
lines. Each Brief Report must be accom- 
panied by a footnote in the style below, 
which is typed on a separate sheet and not 
counted in the 70-line quota: ¹


1 An extended report of this study may be obtained
without charge from John Doe, 300 Market
St., Prospect 6, Mass. (giving the author’s full name
and address), or for a fee from the American
Documentation Institute. Order Document No. ____ from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington 25,
D. C., remitting in advance $____ for microfilm or
$____ for photocopies. Make checks payable to Chief,
Photoduplication Service, Library of Congress.





Extended report. The full report is pre- 
pared in the style specified by the Publica- 
tion Manual (1), except that it may be typed 
with single spacing for economy in photo- 
duplication by the ADI. 
1. American Psychological Association. Council of 
Editors. Publication manual of the American 

This article is concerned with a test taker’s
set to mark a particular response category
“yes,” “true,” or “agree.” According to Cronbach
(5, 6), who has reviewed several response sets
that influence a test taker’s behavior, the
variance generated by the response
set discussed here is regarded as undesirable 
since, according to him, it contributes only 
error variance and cannot be used to increase 
the usefulness of a test. In this article a dif- 
ferent rationale, supported by empirical evi- 
dence, will show how a measure of this re- 
sponse set can be used to improve the validity 
of many tests. 


Nature of the Problem 


Imbalance in scored responses. An inspection
of responses scored in most personality
tests or scales reveals that the significant
responses are predominantly in one direction,
i.e., “yes,” “true,” etc. For example, 96 of
the 100 items in the Cornell Index are keyed
“yes,” and 47 of 60 items in the hysteria
(Hy) scale of the Minnesota Multiphasic 
Personality Inventory (MMPI) are keyed 
“false.” Some other tests in which a large ma- 
jority of scored responses fall into a par- 
ticular response category are the California 
Psychological Inventory, the Humm-Wads- 
worth Temperament Analysis, the Minnesota 
Teacher Attitude Inventory, and the Strong 
Vocational Interest Blank. 

Response set as a suppressor variable. The 
contention here is that when the number of 


1 This article was prepared in connection with a 
“group tests” course taught by the writer during the 
Winter Quarter, 1955, at the University of Minne- 
sota as part of his Ford Foundation Teaching Intern 
Fellowship. 


scored responses is unequally divided between 
all possible response categories the effect of a 
response set may be substantial. If response 
set per se is not directly related to the cri- 
terion, it operates to introduce error in the 
criterion-predictor. This is usually the case. 
If, however, response set is related to another 
variable that is related to the criterion, then 
response set may be used as a suppressor 
variable (14, 17, 18, 19, 21, 22, 25). By sup- 
pressing or removing the influence of response 
set from the criterion-predictor an improve- 
ment in test validity can be effected. 


The Opinion, Attitude, and Interest Survey 


A review (7) of personality tests validated 
to predict academic achievement discloses 
that for the more valid tests the response 
predictive of high achievement is usually 
false, no, or disagree. An imbalance was also 
found in an item analysis of the Opinion, 
Attitude, and Interest Survey (OAIS), a test
constructed by the writer (7, 8). Results 
from the OAIS will be used to illustrate the 
major thesis of this paper. 

Criterion group responses. The criterion 
groups used for the OAIS item analysis were 
340 achieving and 473 nonachieving fresh- 
man men in the College of Science, Litera- 
ture, and the Arts (SLA), University of Min- 
nesota. The groups were equated for tested 
academic ability (ACE), and all had ACE 
percentile ranks above 54 on university fresh- 
man norms. Of the 126 discriminating state- 
ments only 33, or 26 per cent, are scored for 
high achievement if answered true. Why a 
negative reaction to some statements should 
be indicative of high achievement is difficult 
to understand. The statements in the OAIS 
are not concerned with study habits and attitudes.
Typical statements and the response
predictive of superior achievement are:


Too much fuss is made over famous people (F); 
Often I wish I had more freedom (F); Most people 
have a very good imagination (F); I often find my- 
self imitating someone I consider superior (F); Peo- 
ple should go to church more than they do (F); I 
enjoy sleeping during the day more than at night 
(F); I'd rather be thought of as unusual than nerv- 
ous (T); and Marriage before the age of twenty is 
usually foolish (T). 


Interpretation and effect of imbalance. The 
writer originally held the view that the im- 
balance of true and false items was simply a 
function of how the statements were written. 
But this seems unlikely since on some tests 
(e.g., MMPI) having scales for several per- 
sonality dimensions, the scored direction is 
different for different scales. The nature of 
the criterion groups, not the items, is prob- 
ably responsible for the imbalance. The pres- 
ent interest is not in why criterion groups re- 
spond the way they do but in the effect of a 
strong response set. The specific question is 
“Can a measure of response set be used to in- 
crease test validity?” 

It is obvious from the scored direction of 
statements in the OAIS that a test taker not 
reading the statements but marking false as 
the answer to all statements would receive a 
good score (a standard score of about 90). 
His high criterion-predictor score probably 
does not describe his “academic” personality. 
A test user viewing his score would usually 
overpredict his academic achievement. An- 
other student having a “legitimate” response 
set for saying true would tend to mark an 
abnormal number of answers predictive of 
low academic achievement. 

A measure of response set. One index or 
measure of set for answering “true” could be 
obtained by counting the times a person 
marked “true” to statements in a test. This 
method was used by Humm and Wadsworth 
(15) to get a measure of suggestibility or co- 
operativeness. They counted the times a test 
taker answered “no” to questions in the 
Temperament Schedule. Answer sheets hav- 
ing a very large or a very small “no count” 
are not considered sufficiently valid for fur- 
ther analysis. According to their suggested 
cutoff points, 30 to 60 per cent of the answer
sheets would be rejected as invalid (1, 16).

A much more sensitive index of response set 
would be a count of the “true” responses for 
those statements which 49 to 51 per cent of 
the test takers mark true. For such state- 
ments there is no general agreement on an 
answer; they have maximum “controversial- 
ity.” A person without a response set would 
respond true as often as false on these items. 
A person with a high score could be thought 
to have a strong set for marking true. A low 
score would indicate the opposite. It should 
be emphasized that items for such a “set” 
scale are selected independently of any cri- 
terion external to the test. In practice it is 
difficult to get enough items to form a scale 
at the 49-51 per cent level of controversial- 
ity. Accordingly, it was decided to accept for 
the “true” response-set scale, Set T, those
statements in the OAIS which 40 to 60 per
cent of each criterion group marked true. The
Set T scale consists of 69 statements scored
in the true direction. Typical Set T statements
are:


My friends consider me to be a talkative person; 
I do many things just to avoid criticism; Some peo- 
ple like to be considered rude and crude; I generally 
avoid domineering people; Others regard me as hav- 
ing a lot of patience; I’d rather talk than listen in 
a conversation; I am a very practical person; and 
Routine things require too much time. 
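
The selection rule described above can be sketched in code. This is a minimal illustration only, assuming responses are coded 1 for true and 0 for false; the function names and the toy answer sheets are hypothetical and are not taken from the OAIS.

```python
from typing import List, Sequence

Responses = Sequence[Sequence[int]]  # one row per test taker, 1 = true, 0 = false


def proportion_true(group: Responses, item: int) -> float:
    """Share of test takers in one group who marked `item` true."""
    return sum(person[item] for person in group) / len(group)


def select_set_items(groups: List[Responses],
                     low: float = 0.40, high: float = 0.60) -> List[int]:
    """Keep items that every criterion group marks true at a rate within [low, high].

    No external criterion is consulted; only endorsement rates are used.
    """
    n_items = len(groups[0][0])
    return [i for i in range(n_items)
            if all(low <= proportion_true(g, i) <= high for g in groups)]


def set_t_score(answers: Sequence[int], set_items: Sequence[int]) -> int:
    """Set T score: number of 'true' answers among the selected items."""
    return sum(answers[i] for i in set_items)


# Hypothetical toy data: 4 test takers x 3 items in each criterion group.
achievers = [[1, 1, 0], [0, 1, 0], [1, 1, 1], [0, 1, 0]]
nonachievers = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 1, 1]]

set_items = select_set_items([achievers, nonachievers])
score = set_t_score([1, 0, 1], set_items)
```

In the toy data only the first item is marked true by 40 to 60 per cent of both groups, so it alone enters the scale.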


Use of the Set T scale as a predictor. To 
determine whether Set T could be used as a 
suppressor variable, scores from it were cor- 
related with the scores obtained from two 
grade-predictor scales constructed from an 
item analysis of the OAIS, and with the 
honor-point ratio (HPR) of freshmen in SLA. 
The first of the two grade-predictor scales, S, 
was based upon the familiar single response 
(true or false) item analysis; the second 
scale, C, was based upon a configural re- 
sponse item analysis which involves scoring 
simultaneously three responses (two content 
and one intensity) (7, 9). The statistics re- 
ported below were obtained from a cross- 
validation sample of 209 males who took the 
OAIS as part of the regular entering fresh- 
man test battery at least six weeks before the 
beginning of fall quarter classes. 

Set T correlated -.036 with HPR, -.569
with S, and -.505 with C. The r of S with
HPR was .326,² and the r of C with HPR
was .376.³

To see whether a combination of Set T
scores and grade-predictor scores S and C
would result in an increase in predictive
efficiency, a multiple correlation coefficient (R)
was computed.

The R of Set T and S with HPR was .394; 
the R of Set T and C with HPR was .427. 
The increase in both cases is appreciable. 

Optimum weighting of suppressor and criterion
predictor. It should be pointed out that
the r of the corrected predictor with HPR
will equal these R’s only when the optimum
amount of the suppressor, Set T, is added to
the uncorrected predictors, S and C. If the
suppressor and the uncorrected predictor are
simply added, they are weighted according to
their standard deviations. Several “trial and
error” attempts will usually lead to an acceptable
fraction of the suppressor to be
added to the uncorrected predictor. Optimum
weighting of suppressor and uncorrected scores
probably should result in no correlation between
the suppressor and corrected scores.

At first Set T scores were added to S and
C, resulting in APS’ (academic personality-single
response analysis), and APC’ (academic
personality-configural response analysis).
The SD’s of Set T, S, and C were 7.8,
10.0, and 12.3, respectively. It will be noted
that Set T is weighted more heavily in APS’
by this simple addition of scores. The r of
APS’ with HPR was .351, and the r of APC’
with HPR was .404. Since these validity
coefficients (r’s) are lower than the R’s obtained
above it can be concluded that the
grade predictors, S and C, were not weighted
optimally with Set T. The r of Set T with
APS’ was .25 and the r of Set T with APC’
was .15, which indicates that Set T was
weighted too heavily. However, when .7 Set T
was added to S to form APS, the r of APS
with HPR was .38; the r of .7 Set T with
APS was -.01. When .8 Set T was added to


2 For the subgroup (N = 138) with ACE percentile
rank above 54 the r was .41, and for the subgroup
(N = 71) with ACE percentile rank below 55
the r was .10.

3 The r’s here were .44 and .20 for the high and
low ACE subgroups respectively.
C to form APC, the r of APC with HPR was
.41; the r of .8 Set T with APC was .01.⁴ For
practical purposes, the influence of response
set has been removed from the criterion-predictors
as evidenced by the near-zero r with
the suppressor. The important point is that a
measure of response set, Set T, can be used
to increase the accuracy of grade predictions,
even though scores from it are not directly
related to grade achievement.
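
The weighting described above can be checked numerically. The sketch below is a minimal illustration, assuming only the standard formula for the correlation of a weighted sum of two scores; the function names are hypothetical, and the inputs are the rounded SD's and r's reported in the text, so the results only approximate the published coefficients.

```python
import math


def composite_correlations(w, sd_p, sd_s, r_py, r_sy, r_ps):
    """Correlations for the composite (predictor + w * suppressor).

    sd_p, sd_s: standard deviations of predictor and suppressor.
    r_py, r_sy: correlations of predictor and suppressor with the criterion.
    r_ps: correlation between predictor and suppressor.
    Returns (r of composite with criterion, r of suppressor with composite).
    """
    var_c = sd_p ** 2 + (w * sd_s) ** 2 + 2 * w * sd_p * sd_s * r_ps
    sd_c = math.sqrt(var_c)
    r_cy = (sd_p * r_py + w * sd_s * r_sy) / sd_c
    r_sc = (sd_p * r_ps + w * sd_s) / sd_c
    return r_cy, r_sc


def zero_correlation_weight(sd_p, sd_s, r_ps):
    """Weight w at which the suppressor is uncorrelated with the composite."""
    return -sd_p * r_ps / sd_s


# Reported statistics: SD's of Set T, S, C are 7.8, 10.0, 12.3.
aps_simple = composite_correlations(1.0, 10.0, 7.8, 0.326, -0.036, -0.569)
apc_simple = composite_correlations(1.0, 12.3, 7.8, 0.376, -0.036, -0.505)
w_s = zero_correlation_weight(10.0, 7.8, -0.569)
w_c = zero_correlation_weight(12.3, 7.8, -0.505)
```

With these inputs the zero-correlation weights come out near .73 for S and .80 for C, close to the .7 and .8 adopted above.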


Similarity of Set T of the OAIS 
and K of the MMPI 


Construction of the K scale. Since the Set 
T scale of the OAIS functions as a suppressor 
variable, its relation to the K scale (18, 22) 
of the MMPI is of interest. It will be recalled
that 22 of the 30 items in K were obtained
by comparing responses of 50 psychopathic
patients having a high L score and a
normal MMPI profile, with responses of 604
normals (339 Minnesota men and women,
and 265 college men and women) which were
used in constructing other MMPI scales. The
other eight items were selected from a study
of faked responses (18). The 22 items will be
referred to as L6 items, as Meehl and Hathaway
designated them, and the remaining
eight items as “fake” items.

Structure of the K scale. An examination
of the direction of scored responses in K reveals
that 29 are false and only one is true;⁵
the true item is from the L6 items. A study of
the percentage of true-false responses of the
604 normal cases⁶ was made by averaging
the percentage values for the 339 Minnesota
and 265 college cases. Since not all normals
responded true or false to each item, half the
“?” responses were added to the true count.
This crude analysis shows that 20 of the 30
items are answered true by 40 to 60 per cent
of the normals. Without addition of half the
“?” responses there are still 10 of the 30

4 The r of APS with ACE was .12; the r of APC
with ACE was also .12.

5 In this connection it is of some interest to find
that of 40 items in Gough’s personality scale to
measure ability to make a good impression (11), 31
are scored false.

6 The writer is grateful to Dr. S. R. Hathaway for
supplying him the percentage frequencies for the two
subgroups.

items beyond the arbitrary limits. Only one
of the L6 items is not within a 36-64 per cent
range. This item is: “I have very few quarrels
with members of my family.” Only 18
per cent of the normals give the “plus” response;
this K item is scored true. Six of the
eight “fake” items fall outside the 40 to 60
limits, but two (60.3 and 60.5 per cent) almost
qualify.

For practical purposes 22, or 73.3 per cent, 
of the items are in the controversial range; 
but strictly speaking 20, or 66.7 per cent,
meet the requirement. 
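
The tally just described can be sketched as follows. This is an illustration only, assuming percentages of “true” and “?” responses are available for each item; the function names and the first example item are hypothetical.

```python
def adjusted_percent_true(pct_true: float, pct_cannot_say: float) -> float:
    """Add half of the '?' responses to the true count, as in the analysis above."""
    return pct_true + pct_cannot_say / 2


def in_controversial_range(pct: float, low: float = 40.0, high: float = 60.0) -> bool:
    """True when an item's endorsement rate lies within the stated limits."""
    return low <= pct <= high


# Hypothetical item: 58 per cent true, 3 per cent '?'.
example = adjusted_percent_true(58.0, 3.0)

# An item at 60.3 per cent, like one of the "fake" items, falls just outside.
near_miss = in_controversial_range(60.3)

# Share of the 30 K items counted as controversial "for practical purposes".
practical_share = round(100 * 22 / 30, 1)
```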

These percentages by themselves are not 
important if approximately two-thirds of the 
items in the MMPI are within the 40 to 60 
per cent limits. An examination of 495 items 
on which data were available shows only 88, 
or 17.8 per cent, are found within the limits. 
Since 20 of the 88 are in the K scale, only 68 
or 14.6 per cent of 465 non-K items are be- 
tween 40 and 60 per cent. If half the “?” re- 
sponses are not added, only 55, or 11.8 per 
cent, of 465 non-K items are within the limits. 
It is clear from this that the procedure used 
in constructing K has resulted in selecting 
items useful for measuring response set as 
described above in the OAIS analysis. It is 
possible that if a different psychopathic cri- 
terion group or a larger number of psycho- 
pathic patients were used for item analysis 
more than two-thirds of the K items would 
have been found in the controversial range. 

Functional similarity of K and Set T. Fur- 
ther evidence on the similarity of Set T and 
K is found in their correlation with each 
other. The r of Set T with K for 209 fresh- 
man men is — .63; the r for 138 high ACE 
freshmen is — .58, and the r for 71 low ACE 
freshmen is — .71. The latter subgroup re- 
sembles the MMPI normal population more 
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closely than the total SLA freshman group. 
The test-retest reliability of K for Univer- 
sity of Minnesota freshmen was reported to 
be .51 by Berdie (2). Rosen (23) reported a 
.65 retest reliability for psychiatric patients. 
Hathaway and Monachesi (13) reported a re- 
test reliability of .66 and Meehl and Hath- 
away (22) reported reliabilities of .72 and 
.74. It would appear that Set T of the OAIS 
predicts K of the MMPI as well as K pre- 
dicts itself. 

The r’s of Set T and K with several non- 
MMPI variables are presented in Table 1. 
The agreement in the r’s for uncorrected 
scores is quite strong. Minus K, rather than 
K, has been used to simplify comparisons. 
The Time score in the table is the time in
minutes a test taker requires to complete the
OAIS. The I (Infrequency) scale is similar
to the F scale of the MMPI in that it is obtained
by scoring those response configurations
which are given by less than 4 per cent
of each of the original criterion groups. HSRP
is the test taker’s high school rank percentile
transformed to probit values. The other symbols
are not new. The r’s of Set T and K with
S, C, APS’, APC’, APS, and APC reveal a 
marked similarity in the function of Set T 
and K. The lower r’s of Set T and K with 
scales employing the configural items prob- 
ably indicate that scores obtained from such 
scales are less subject to distortions because 
of response set, “defensiveness,” etc. 

Correlation of K with clinical scales. Table 
2 presents some r’s of K with uncorrected 
MMPI scales. The consistently high negative 
r’s for K with Pt, Sc, and Ma are noteworthy. 
The r’s of K with Pd and Hs, though nega- 
tive, are generally lower. The r’s with Hy are 
consistently positive and somewhat higher 
than the r’s for Pd. A generally negative but 


Table 1
Correlations of Set T and K with Certain Non-MMPI Variables for 209 Freshman Men

                                        Variable
Suppressor    HPR   Time      I      S      C   APS’   APC’    APS    APC   HSRP    ACE  Coop. Eng.   Ohio
Set T         -04    -17    -12    -57    -51     25     15    -01     01     00    -04         -06    -10
-K            -03    -09     02    -50    -38     00     03    -17     06     01    -03         -03    -05

Table 2
Correlations of K with Uncorrected MMPI Clinical Scales

Sample             Sex    N    Hs     D
Graduates (18)     M    100   -25    08
College (24)       M    112   -31   -07
College (4)        M    179   -48   -20
Normal (22)        M    100   -30    15
Normal (22)        F    100   -35   -03
NP patients (24)   M    110   -20   -19
NP patients (26)   M    100   -10   -OF
Abnormals (22)     M    100   -42   -29
Abnormals (22)     F    100   -17   -16

MMPI Scale

Hy Pd Mf Pa Pt Sc Ma
53 -09 -08 22 -65 -46 -42
50 -09 -16 19 -59 -51 -30
21 -26 -23 -12 -62 -57 37
48 -17 -07 -67 -59 -36
30 -06 -02 -64 -58 28
14 -38 -19 -26 -66 55 45
15 -16 -15 -30 -25 -10
11 -26 -08 -19 -60 -37
17 -21 04 13 63 58 38


low r is found for Pa, Mf, and D. The first 
five clinical scales mentioned above are those 
which are corrected by K. In view of the r’s 
with Hy, it would appear that a correction 
for it would be useful; more will be said on 
this later. 

Correlation of K and Set T with MMPI 
scales and optimum weighting. Table 3 shows 
the r’s of Set T and K with five corrected 
and seven uncorrected scales. These r’s are 
especially interesting. The r’s for Set T and 
K are remarkably alike for all variables (the 
signs are opposite). From the correlations 
with K it would appear that K and Pé# and 
Sc have been weighted reasonably well, as 
shown by the low r of K with K-corrected 
Pt and Sc; perhaps Hs and Pd have been 
overweighted and Ma underweighted with K. 


These are also the conclusions to be drawn
from the r’s of Set T with the K-corrected
scales. The r’s of K with uncorrected scales
are similar to those in Table 2. The r’s of Set
T with uncorrected scales are also similar to
the r’s in Table 2. It would appear that Set T
and K accomplish essentially the same thing.
Whether Set T is superior to K is not determinable
from the available data. While K
may not be as pure a measure of response
set, it may do something in addition to what
is done by Set T.

Prior to collecting all the data involving K
the writer suggested to Hathaway that perhaps
.4K or .5K should be added to Ma
rather than .2K. Hathaway located in his files
the differential ratio (18) curve for the sample
used to determine how much K should be


Table 3
Correlations of Set T and K with Five K-corrected and Seven Uncorrected MMPI Scales

                                                                        MMPI Scale
Suppressor Sample                Sex     N        L        F   Hs+.5K        D       Hy   Pd+.4K       Mf       Pa  Pt+1.0K  Sc+1.0K   Ma+.2K       Si
Set T      College freshmen (7)  M     209      -23       27      -26       02      -28      -17       20       04       02       02       32       25
K          College freshmen      M     209       44      -41       42       03       36       26      -23      -09      -05       12      -31      -49
K          Ninth graders (13)    M     200       52      -42       45       08       45       27      -05      -14       11       08      -23      -27
K          Ninth graders (13)    F     200       44      -44       22       07       32       11       03      -08      -01      -07      -25      -52

added to Ma. This curve showed the validity
of Ma plus K increased rapidly to .2K, remained
at this level until .5K, then decreased.
Hathaway recalled that .2K was selected to
avoid adding additional error variance. This
writer believes in view of the data collected
to date that the addition of .4K or .5K would
improve the validity of Ma. As mentioned
above the r’s of Set T and K with Hs and 
Pd in Table 3 are offered as evidence that too 
much K has been added to Hs and Pd. Ad- 
ditional evidence that the Pd scale may be 
overcorrected is found in a reanalysis of Cap- 
well’s data (3). McKinley, Hathaway, and 
Meehl (18) reported the K-corrected Pd scale 
decreased the differentiation of Capwell’s ado- 
lescent delinquents from their matched con- 
trols. 


The MMPI: Operation of the K Scale 


The manner in which K operates to im- 
prove the validity of certain MMPI scales 
has not been described in the literature. One 
way of seeing how K works is through an 
analysis of the scored responses for each 
clinical scale. It should be obvious that focus- 
ing attention on the scored direction of re- 
sponses is not a very precise method of analy- 
sis. This approach ignores the percentage of 
clinical and normal persons responding to
each item; one item to which a large number
of one or both groups respond in the
scored direction may count as much as several
items to which a very small number of
one or both groups respond in the scored di- 
rection: e.g., half the Hs items are answered 
in the plus direction by less than 15 per cent 
of the normal population. Other factors such 
as unusual wording of items may contaminate 
the results: e.g., many Hs items are worded 
in the negative, “I do not often notice my 
ears ringing or buzzing.” The effect of items 
which are common to several scales is another 
important factor. Wheeler, Little, and Lehner 
(24) point out, for example, that an r of 
+ .46 can be expected between Hs and Hy 
because of 20 overlapping items. And perhaps 
most important, response set, as Cronbach 
(5, 6) points out, is probably of little conse- 
quence when the items are unambiguous. 

Direction and significance of scored responses
for uncorrected scales. It was mentioned
earlier that the items in different
MMPI scales are scored in different direc- 
tions. Table 4 presents the number and per- 
centage of scored responses as well as the 
standard score a test taker would have if all 
items were marked true, or if all items were 
marked false (12). The magnitude of the dif- 
ferences in standard scores and percentage of 
true and false should be noted. The “true re- 
sponse” profile increases from left to right 
and resembles what has become known as a 
psychotic-type profile, the “false response” 


Table 4
The Number and Percentage of True-False Scored Responses and Standard Scores (for Men)
for 13 Uncorrected MMPI Variables

                                                            MMPI Scale
Scored response                     L     F     K    Hs     D    Hy    Pd    Mf    Pa    Pt    Sc    Ma    Si
True                                0    44     1    11    20    13    24    28    25    39    61    35    35
False                              15    20    29    22    40    47    26    32    15     9    17    11    35
Total                              15    64    30    33    60    60    50    60    40    48    78    46    70
Percentage true                     0    69     3    33    33    22    48    47    63    81    78    76    50
Percentage false                  100    31    97    67    67    78    52    53    37    19    22    24    50
Percentage difference            -100    38   -94   -34   -34   -56    -4    -6    26    62    56    52     0
T score when all marked true      50-   80+    29    65    58    44    75    65   100    91   120    97    62
T score when all marked false     80+   80+    81    90   106   106    81    73    70    49    60    43    62
T score difference                            -52   -35   -48   -62    -6    -8    30    42    60    54     0

Table 5
The Number and Percentage of True-False Scored Responses and Standard Scores (for Men)
for Five K-corrected MMPI Clinical Scales

                                                 MMPI Scale
Scored response                    Hs+.5K   Pd+.4K  Pt+1.0K  Sc+1.0K   Ma+.2K
True                                   12       25       40       62       35
False                                  37       38       38       46       17
Total                                  49       63       78      108       52
Percentage true                        24       40       51       57       65
Percentage false                       76       60       49       43       35
Percentage difference                 -52      -20        2       14       30
T score when all marked true           52       64       85      120       96
T score when all marked false         116       95       81       96       50
T score difference                    -64      -31        4       24       46





profile decreases from left to right and re- 
sembles the neurotic-type profile. 

A scale to measure response set is likely to 
be useful when the discrepancy in percentage 
of true-false is greater than, say, 40 (the dis- 
crepancy was 48 per cent on Scale S of the 
OAIS). Application of this discrepancy con- 
ceivably could result in improving 4 of 10 
MMPI clinical scales. The three scales hav- 
ing the majority of scored responses as true 
(Pt, Sc, and Ma) could be adjusted by add- 
ing, say, Set False scores and the one scale 
having most scored responses as false, Hy, 
could be adjusted by subtracting Set False 
scores. The fraction of Set False to be added 
to, or subtracted from the clinical scale can 
be determined empirically as described earlier 
for Set T of the OAIS. 
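
The screening rule suggested above can be sketched as follows. This is a minimal illustration with hypothetical names; the keyed counts are those of Table 4 for the ten clinical scales, with the counts for Sc, Ma, and Si inferred here from the printed totals and percentages rather than read directly. Percentages are rounded before differencing, which is an assumption about how the tabled differences were obtained.

```python
# Scale: (items scored true, items scored false), uncorrected clinical scales.
# Sc, Ma, and Si counts are inferred from the totals and percentages (assumption).
KEYED = {
    "Hs": (11, 22), "D": (20, 40), "Hy": (13, 47), "Pd": (24, 26), "Mf": (28, 32),
    "Pa": (25, 15), "Pt": (39, 9), "Sc": (61, 17), "Ma": (35, 11), "Si": (35, 35),
}


def percentage_difference(n_true: int, n_false: int) -> int:
    """Rounded percentage true minus rounded percentage false."""
    total = n_true + n_false
    pct_true = (200 * n_true + total) // (2 * total)  # integer round-half-up
    pct_false = 100 - pct_true
    return pct_true - pct_false


def flag_imbalanced(keyed: dict, threshold: int = 40) -> list:
    """Scales whose true-false discrepancy exceeds the threshold in either direction."""
    return [scale for scale, (t, f) in keyed.items()
            if abs(percentage_difference(t, f)) > threshold]


flagged = flag_imbalanced(KEYED)
```

Under these assumptions the rule singles out Hy, Pt, Sc, and Ma, the four scales named in the text.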

Of course since only the scored direction is 
being taken into account, a measure of re- 
sponse set will not necessarily improve the 
validity of these scales. A correlation, posi- 
tive or negative, between Set False and a 
clinical scale probably indicates that Set 
False could be used as a suppressor. If Set 
False (or for that matter any variable) is 
not related to the clinical behavior, but is 
related to the MMPI clinical scale, then Set 
False may be used as a suppressor even 
though there is no true-false imbalance. 

If K is accepted as the measure of response 
set in place of Set False, to add false re- 
sponses to Pt, Sc, and Ma, and subtract false 


responses from Hy, the validity of the four 
clinical scales would be improved. It will be 
recalled (Tables 2 and 3) that the r’s for the 
uncorrected clinical scales with K were larg- 
est for Pt, Sc, Ma, and Hy. 

Interpretation of K scale. Meehl and Hath- 
away (22) reported that r’s between K and a 
clinical scale “as low as .20 can be utilized 
to yield very significant and useful improve- 
ments in discrimination.” The validity of Pt, 
Sc, and Ma was improved by the addition of 
K or a fraction of K (18). It appears (18, 
22) that attempts were made to add K to 
Hy. That they should attempt addition of K 
and Hy follows from the commonly held view 
of the K scale. McKinley, Hathaway, and 
Meehl (18) wrote, “since high K scores rep- 
resent the defensive or fake good end of the 
test attitude continuum, the most obvious ap- 
proach to the problem is to add K (or some 
fraction of K) to the raw score on each per- 
sonality variable, i.e., increase the score in the 
direction of abnormality.” They apparently 
felt the K correction for Hy resulted in no im- 
provement because the Hy scale contained 
10 of the 30 K items which were scored in 
the same direction. If K is viewed as a meas- 
ure of response set, then K or some fraction 
of it should be subtracted from Hy. That is, 
a person with a high K has probably earned 
a higher Hy than he should have. To say that 
he has “faked bad” is debatable. If K is not 
an adequate measure of response set, then 
false responses from a scale such as Set F 
could be subtracted from Hy, or true responses 
from a Set T scale could be added to Hy. 

Direction and significance of scored re- 
sponses for K-corrected scales. Table 5 shows 
data similar to those found in Table 4 except 
that the K-corrected scales are described. On 
the basis of the number of true and false 
scored responses in Table 5 the r’s for Set T 
and K with the K-corrected scales in Table 3 
are to be expected: e.g., the r between K and 
Hs + .5K is positive, the r between K and 
Ma + .2K is negative, and the r between K 
and Pt + 1.0K is about zero. 

For all K-corrected scales in Table 3 the 
r’s with K are consistently less negative (some 
are positive) than the comparable r’s in Table 
2. From Tables 4 and 5 it can be seen that 
these changes in r are accompanied by a con- 
sistent increase in the proportion of false re- 
sponses in the clinical scales. However the r’s 
of Tables 2 and 3 indicate that the true-false 
imbalance shown in Tables 4 and 5 will not 
explain all the r’s obtained for the suppressors
K and Set T with the clinical scales:
e.g., uncorrected Hs correlates negatively with 
K despite the fact that 67 per cent of the 33 
Hs items are scored false; Si correlates posi- 
tively with Set T and negatively with K de- 
spite the fact that 50 per cent of the Si items 
are scored false. Several possible explanations 
for these unexpected correlations have been 
mentioned. Perhaps the important point here 
is that a measure of response set such as K 
or Set T may function as a suppressor when 
there is no true-false imbalance, and may not 
function as a suppressor when there is a 
moderate true-false imbalance. 

Correlations of K with other MMPI scales. 
Of considerable interest is the relation of K 
to several nonclinical scales which were con- 
structed for the MMPI and reported by 
Meehl and Hathaway (22). K was found to 
correlate — .70 with N, a scale derived em- 
pirically by Meehl (20) to correct for self- 
criticality of certain normal plus-getters who 
show deviant profiles. As would be expected, 
the majority (50 of 78) of the items are 
scored true. The r of K with G and + (plus) 
was — .76 and — .64, respectively; both these 
scales were derived wholly by a method of 
internal consistency and without regard to 
nontest behavior. Only 9 of 62 items in G, 
and 11 of 55 items in + are scored false.⁷ 
The r of K with CH, the correction factor for 
the old hypochondriasis scale (H-CH), was 
-.67; CH was derived from an item analysis 
of the responses of the original 50 hypochondriacal 
criterion cases and 50 nonhypochondriacal 
psychiatric patients who scored high on 
H.⁸ Of the 48 items in CH only 13 are scored 
false. The r of K with Hy-O was .81; the 
Hy-O scale consists of the zero items in the 
Hy scale and 19 of its 20 items are scored 
false. Meehl and Hathaway reported that a factor 
analysis of a correlation matrix of these special 
scales revealed “one common factor is 
quite sufficient to account for the intercorrelations 
of these scales.” It may be that this 
common factor is response set, and that the 
best measure of it might be obtained in the 
manner described for Set T. 


Some Implications for Personality 
Test Research 


Studies need to be done on the nontest be- 
havior of those with a strong or weak response 
set to mark a particular response alternative. 
The determination as to what is measured by 
scales such as Set T and K should be made 
in this way. Too frequently scores from the 
K scale have been correlated with scores from 
other tests, and judgments have been made 
as to what K was really measuring. Validity 
indicators such as L and F should be composed 
of items without a true-false imbalance 
so that independent measures are obtained. 

In view of the ease with which an OAIS 
Set T-type scale can be constructed, the 
writer recommends that response set scales 
be prepared for other empirically validated 
personality tests; scales correlating with re- 
sponse set should be corrected. It seems un- 
likely that response set keys would be able to 
bring validity to tests which are not validated 
empirically even though response set may be 
a major component in a test taker’s score. It 
is important that those who derive and use 
empirically validated personality tests be con- 


7 The writer obtained the scoring stencil for G and 
+ from Dr. S. R. Hathaway to check the inference 
that most of the items were scored true. 

8 The construction of the present 33 item Hs scale 
has not been described in the literature; it consists 
of 31 items taken from the 55 item H scale, and two 
items which were not in the old hypochondriasis 
(H-CH) scale. 




cerned not only with the degree to which an 
item discriminates between criterion groups 
but also with the direction of the scored re- 
sponses. A recent review by Furst and Fricke 
(10) discloses that rarely does a scale con- 
structor indicate the direction of the scored 
responses. The correlations between many 
scales and tests can be “explained,” at least 
in part, by an inspection of the scored re- 
sponses and recognition of the role that re- 
sponse set may have played. 


Summary 


A method for constructing a scale to meas- 
ure a test taker’s set to say “true” to person- 
ality test items was described. The Set T 
scale of the Opinion, Attitude, and Interest 
Survey (OAIS) was used to show how re- 
sponse set could be harnessed to function as 
a suppressor variable and improve the va- 
lidity of two empirically validated grade-pre- 
dictor scales. 

The marked structural and functional simi- 
larity of Set T of the OAIS and K of the 
MMPI was drawn upon to challenge the tra- 
ditional interpretation of the K scale. Some 
evidence was assembled which indicates that 
some of the MMPI scales are not optimally 
K-corrected. 

It was suggested that those who construct 
and use personality tests should be concerned 
with the direction of the scored responses as 
well as with the degree to which an item dis- 
criminates. 


Received August 29, 1955 
Two Measures of Anxiety¹ 


Erwin J. Lotsof and Walter L. Downing 


University of California, Los Angeles 


The chronic anxiety syndrome, as defined 
by Cameron, is “characterized by the pres- 
ence of persistently heightened skeletal and 
visceral tensions . . .” (2, p. 249). Using this 
definition, Taylor (5) devised a paper-and- 
pencil test to measure manifest anxiety. 
Mowrer (4) and Bixenstein (1) use the term 
tension in the therapeutic situation in a way 
which appears to be synonymous with anx- 
iety. They measure changes in tension by re- 
cording palmar perspiration, and by comput- 
ing a Discomfort-Relief Quotient. In view of 
the apparent similarity of these two concepts, 
anxiety and tension, it was felt that the two 
measures, if applied to the same population, 
should show a substantial correlation. 

The Taylor Scale of Manifest Anxiety 
(TS) was administered to an introductory 
psychology class. A week later, 30 of the stu- 
dents participated in another “experiment” 
in which palmar perspiration (PP) was re- 
corded by the method described in Mowrer 
(4). No indication was given the participants 
that the two measures were related, nor that 
they concerned anxiety. Two judges ranked 
the fingerprints for density in an almost 
identical sequence (rho = .99). The range of 
TS scores was 1 to 26, with a mean of 10.23, 
and an SD of 6.33. The TS scores were 
ranked, and the two measures gave a rank- 
order correlation of 0.00. 
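
The rank-order correlation reported above can be illustrated with a short computation. This is a sketch only: the report does not state which formula was used, so the version below (Pearson correlation of the ranks, with tied values given averaged ranks) is an assumption, and the scores are hypothetical.

```python
def rank(values):
    """Assign 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank-order correlation: the Pearson r computed on the two sets of ranks."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den


# Hypothetical TS scores and judged print-density ranks for five Ss.
ts_scores = [3, 10, 26, 7, 15]
pp_density = [2, 5, 1, 4, 3]
print(round(spearman_rho(ts_scores, pp_density), 2))
```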


1 An extended report of this study may be obtained 
without charge from Erwin J. Lotsof, Dept. 
of Psychology, University of California, Los An- 
geles 24, Calif., or for a fee from the American Docu- 
mentation Institute. Order Document No. 4827 from 
the ADI Auxiliary Publications Project, Photodupli- 
cation Service, Library of Congress, Washington 25, 
D. C., remitting in advance $1.25 for microfilm or 
$1.25 for photocopies. Make checks payable to Chief, 
Photoduplication Service, Library of Congress. 


Although the number of Ss was small, some 
correlation would be expected if any relation- 
ship existed between the variables being meas- 
ured. If anxiety is conceived of as “permeat- 
ing” the entire personality, and if tension is 
conceived of as a response of the total or- 
ganism, then the results are difficult to ex- 
plain. 

Assuming that the tests are valid measures 
of manifest anxiety and of tension, two hy- 
potheses might be offered to account for the 
findings. (a) Anxiety and tension are inter- 
vening variables whose consequences become 
manifest in certain situations. The negative 
findings are due to the different testing times. 
(b) The terms have different sets of referents. 
The test of anxiety is psychological; that for 
tension is physiological. Although there is a 
large semantic overlap, the referents may be 
concerned not with different levels of the 
same event, but with different events. 

It would appear desirable, therefore, in 
using concepts of anxiety and tension, to fol- 
low Maslow (3) in indicating the source of 
the concept with a subscript. 


Brief Report. 
Received February 23, 1956. 
Social Desirability in the MMPI¹ 


Wilbert E. Fordyce 


Veterans Administration Hospital, Seattle, Washington 


The concept of social desirability (here- 
after, S-D) has been explored and elaborated 
in several studies (1, 2, 3, 4, 6). The concept 
concerns elucidating a basis for describing and 
controlling test-taking attitudes on person- 
ality inventories and schedules. A preliminary 
definition is, “Consensus judgments as to what 
behavior, feelings, and attitudes win social 
approval in American society.” A more opera- 
tional specification will be given below. 


Problem 


The impetus for studying S-D stems largely 
from a study by Edwards (1) in which he 
found frequency of endorsement of person- 
ality inventory items as self-descriptive cor- 
related .87 with the rated social-desirability 
values of the items. Edwards proceeded to 
construct the Personal Preference Schedule 
(3), an objective-type personality test, based 
on Murray’s need system, in such a way as 
to control this test-taking attitude of endors- 
ing socially desirable items. The construction 
of the test involves the assumption, supported 
by the studies cited above, that a major part 
of the variance in these types of tests is ac- 
counted for by the test taker’s desire to sub- 
scribe to characteristics that are desirable in 
our society. The present study concerns itself 
with an extension of this concept of S-D to 
the Minnesota Multiphasic Personality In- 
ventory (MMPI) (5). 

The principal validity scales of the MMPI 
(F and K) attempt to characterize the sub- 
ject’s approach to the test and his readiness 
to subscribe to items in the clinical scales de- 
scriptive of mental illness. The relationships 
between these validity scales and the clinical 

1 From Veterans Administration Hospital, Seattle, 
Washington, and Department of Psychiatry, University 
of Washington School of Medicine. 


scales are formalized in the instance of K by 
applying a correction to five of the clinical 
scales based on the K score. With the general 
notion that the subject’s approach to the test 
can be substantially characterized by his 
readiness to respond to socially desirable or 
socially undesirable items, and that this ap- 
proach is reflected in both the validity and 
clinical scales of the MMPI, the hypotheses 
for this study are: 

a. There will be a significant correlation between 
an S-D scale and the F and K² scales 
of the MMPI. 

b. There will be a significant correlation 
between an S-D scale and the clinical scales 
of the MMPI. 


Method 


The measure of S-D used in this study de- 
fines the concept. A subset of MMPI items 
consisting of those making up the F, K, and 
Taylor Anxiety (7) scales was presented to a 
group of 10 judges. The judges ranged from 
faculty to students and departmental secretaries 
of the University of Washington Department 
of Psychology.³ They were asked 
to select the items, and the direction of their 
scoring, which would be socially desirable re- 
sponses when endorsed by test subjects. Only 
those items on which all 10 judges agreed as 
to selection and direction of scoring (true or 
false) made up the S-D scale. Seventy-nine 
items met this criterion. The S-D score be- 
came the number of these 79 items subjects 
answered in the appropriate direction. 
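
The construction rule just described (retain an item only when all judges agree on both its selection and its scored direction, then count matching answers) can be sketched as follows. The data layout, function names, and item numbers are hypothetical and are used only to make the rule concrete.

```python
def build_sd_key(judgments):
    """Build a scoring key from judges' ratings.

    judgments maps an item id to a list with one entry per judge:
    True or False for the direction judged socially desirable,
    or None when that judge did not select the item.
    An item is kept only if every judge selected it and all agree.
    """
    key = {}
    for item, votes in judgments.items():
        if None not in votes and len(set(votes)) == 1:
            key[item] = votes[0]
    return key


def sd_score(responses, key):
    """Count the keyed items answered in the socially desirable direction."""
    return sum(1 for item, direction in key.items()
               if responses.get(item) == direction)


# Hypothetical ratings from three judges on four items.
judgments = {
    101: [True, True, True],      # unanimous: keyed true
    102: [False, False, False],   # unanimous: keyed false
    103: [True, False, True],     # disagreement on direction: dropped
    104: [True, None, True],      # not selected by one judge: dropped
}
key = build_sd_key(judgments)
print(key, sd_score({101: True, 102: True, 103: True}, key))
```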

Subjects consisted of all the available, com- 
pleted, group-form MMPI records on male 


2 Edwards (3) has already reported a correlation 
of .63 between S-D and K. 

3 The scale was constructed by Allen L. Edwards, 
University of Washington. 
Table 1 

Item Overlap Between S-D Scale and Other MMPI Scales 

MMPI Scale 
No. of items L F se 1° FS RA HH BR & Re 
No. in MMPI Scale is & Dn BO OO 5SO © 4 4 7 4 
Scored same in S-D 0 0 5 2 1 2 0 0 0 0 3 
Scored opposite in S-D 0 48 i v7) 7 4 eo. ,» 2 





psychiatric and/or neurological patients in 
the VA Hospital, Seattle. Only the first 
MMPI record obtained in this hospital was 
used. Records with more than 25 items 
omitted were excluded. There were 97 rec- 
ords meeting these criteria. Since testing oc- 
curs on a consultation basis, this sample is 
not a random representation of the neuropsy- 
chiatric population in this hospital. However, 
except for the exclusion of patients too hy- 
peractive, confused, or with motor disabili- 
ties prohibiting handling the test, the sample 
would appear to be reasonably representative 
of hospitalized male veteran psychiatric and/ 
or neurological patients. 


Results and Discussion 


Since both hypotheses involve the degree 
of correspondence between the S-D scale and 
the regular MMPI scales, the question of in- 
fluence of item overlap becomes important. 
This issue is particularly important with re- 
gard to the first hypothesis, which raised the 
question of the amount of correlation be- 
tween S-D and F and K, because the S-D 
scale was constructed from an item pool based 
on these scales and the Taylor Anxiety scale. 
Table 1 shows the amount of item overlap 
between the S-D scale and each of the regular 
MMPI scales. 

It seems evident from inspection of Table 1 
that item overlap is not sufficient to establish 
substantial correlations between S-D and the 
validity and clinical MMPI scales, except for 
F. In that instance the 75 per cent overlap 
would assure a substantial correlation be- 
tween the two only as long as the direction 
of scoring of the overlapping items was con- 
sistent. The fact that all 48 F items occurring 
in S-D are scored in a single direction on 
S-D, by unanimous judgment of 10 judges, is 


itself indicative of a correlation of F with 
S-D. In order to be able to evaluate sepa- 
rately the effects of S-D and F, the item over- 
lap between the two scales was eliminated in 
a subscale of S-D, by scoring all records sepa- 
rately for the 31 S-D items not common to F. 
This subscale is designated as S-D-F. 
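
Removing the overlap is a simple set operation on the scoring key. The sketch below uses placeholder item numbers (not actual MMPI item numbers) chosen only so that the counts match those given in the text: 79 S-D items, 48 of them shared with F, leaving 31.

```python
def remove_overlap(sd_key, f_items):
    """Return the S-D key restricted to items that do not appear in F."""
    return {item: direction for item, direction in sd_key.items()
            if item not in f_items}


# Placeholder ids: items 0-78 form the S-D key, items 0-47 also belong to F.
sd_key = {item: True for item in range(79)}
f_items = set(range(48))
sd_minus_f = remove_overlap(sd_key, f_items)
print(len(sd_minus_f))
```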

Table 2 gives the correlations between S-D 
and S-D-F and the clinical and validity scales 
of the MMPI, and between F and K and 
these scales for the 97 test records. Table 2 
indicates S-D has a highly significant correla- 
tion with both F, — .82, and K, .69, confirm- 
ing the first hypothesis. The fact that S-D 
correlates higher with each of these scales 
than they do with each other (— .55) seems 
to indicate at least one common factor under- 
lies F, K, and S-D, and that S-D better esti- 
mates this common factor than does either 


Table 2 

Correlation of S-D and Validity Scales with Clinical 
and Validity Scales, Without K Correction 

Comparison Scales 

MMPI 
Scales S-D S-D-F F K 
L 24* 26** 10 30** 
F -82** -60** -55** 
K 69** 71** -55** 
Hs -70** -49* si** -60* 
D -o** -70* so** | -45** 
Hy -33** -47* w.. «8 
Pd -Si** -33** so** 9 -41** 
Mf -21 -33** 35** -23* 
Pa -oo** -47* 62** -32** 
Pt -86** -65** eo** = -75** 
Sc -91** -62** s4** =-70** 
Ma -so** -30** 49** = -S0** 
Mean† -651** -498** 566** -479** 

* Significant departure from zero at 5% level. 
** Significant departure from zero at 1% level. 
† Nine clinical scales only. 
Fig. 1. z’ values of the Wheeler psychotic factor, S-D, F, and K. Note that Wheeler presented no 
loading of his factor on Hy. 


F or K. The correlation of — .60 between 
S-D-F and F clearly indicates that when the 
F items are removed from S-D there is still 
a highly significant relationship. 

The second hypothesis, that S-D will cor- 
relate significantly with the clinical scales, is 
also substantiated by the correlations pre- 
sented in Table 2. The mean correlation be- 
tween S-D and the nine clinical scales of 
— .651 is very highly significant. Only the 
Mf scale has a correlation which is not sig- 
nificant. The mean correlation of S-D-F with 
the clinical scales of — .498 is also highly 
significant, again indicating that S-D is some- 
thing more than the F scale. Since S-D-F at 
least roughly approximates S-D in size and 
pattern of correlations in Table 2, in the in- 
terest of brevity it will receive no further 
consideration. 

The pattern or profile of correlations in 
Table 2 is quite similar in shape for S-D, F, 
and K. This lends further support to the no- 
tion that there is an underlying factor com- 
mon to those three scales. Since the correla- 
tions between these three scales and the 


clinical scales are so high, it seems evident 
the factor is well reflected in several of the 
clinical scales. At this point a study was made 
of one factor analysis of MMPI records to 
investigate further this apparent underlying 
factor. 

Wheeler, et al. (8), using a sample of 112 
male veteran hospitalized psychiatric patients, 
intercorrelated and factor analyzed a pool of 
MMPI records. This resulted in two factors 
labeled, Psychotic and Neurotic. For the pur- 
poses of this study only the first Wheeler fac- 
tor will be considered. This Psychotic factor 
was characterized thus: “. . . when this fac- 
tor is present to a marked degree in an indi- 
vidual, the usual ego-defensive mechanisms 
are held in abeyance and the person now 
tends to show himself in the worst possible 
light” (8, p. 139). This obviously resembles 
a definition of social undesirability. If that is 
the case, the loadings of the clinical scales on 
this factor should resemble the distribution of 
correlations between the S-D scale and the 
clinical scales. Since Wheeler’s factor loadings 
were obtained by orthogonal rotations, it becomes 
feasible to treat his factor loadings 
as correlations, permitting direct comparison 
with the correlations presented in Table 2. 
In order to correct for the skewed distribu- 
tion of r and pull apart the distributions of 
F, K, S-D, and the Psychotic factor on the 
higher correlations, these data are presented 
as z’ transformations of the original correlations. 
They are presented in Figure 1 in 
graphic form, with the signs of F and the 
Psychotic factor changed to negative to make 
visual comparison easier. 

The striking thing about Figure 1 is the 
remarkable similarity in shape of the four 
distributions, F, K, S-D, and the Psychotic 
factor. It seems evident that each of the four 
is well saturated with some common com- 
ponent. It will be noted from Figure 1 that 
S-D more closely approximates the shape and 
elevation combined of Wheeler’s factor than 
does either F or K. This greater conformity 
of S-D is indicated by the sum of the squares 
of the differences of z’ values between each of 
S-D, F, and K with the Psychotic factor 
across the nine clinical scales. This crude 
measure of conformity yielded values of 1.53 
for K, 1.03 for F, and .52 for S-D. The 
smaller value for S-D clearly indicates less 
discrepancy between the z’ values of S-D and 
Wheeler’s factor than for either F or K. This 
correspondence is some joint function of shape 
and elevation. The greatest gap between S-D 
and F and K in relation to the Psychotic fac- 
tor is on the psychotic scales of Pa, Pt, and 
Sc. Thus, S-D better represents the factor 
than either F or K where the factor plays its 
biggest role. 
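
The crude conformity measure described above can be sketched in a few lines: each correlation is converted with Fisher's z’ transformation, and the squared differences between two profiles are summed across the clinical scales. The profile values below are hypothetical; Wheeler's loadings are not reproduced in this report, so the sketch shows the computation only and does not attempt to recover the reported values of 1.53, 1.03, and .52.

```python
import math


def fisher_z(r):
    """Fisher's z' transformation of a correlation: atanh(r)."""
    return math.atanh(r)


def discrepancy(profile_a, profile_b):
    """Sum of squared differences between two profiles after z' transformation."""
    return sum((fisher_z(a) - fisher_z(b)) ** 2
               for a, b in zip(profile_a, profile_b))


# Hypothetical correlations with three clinical scales, with signs already
# aligned so that all profiles run in the same direction.
factor_profile = [-0.80, -0.85, -0.40]
scale_close = [-0.75, -0.88, -0.35]
scale_far = [-0.50, -0.60, -0.20]

print(round(discrepancy(factor_profile, scale_close), 3))
print(round(discrepancy(factor_profile, scale_far), 3))
```

A smaller sum indicates a profile lying closer to the factor in both shape and elevation, which is the sense in which the text calls S-D the better estimate.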

These findings would seem to indicate that 
much of the approach to the MMPI which F 
and K attempt to characterize can better be 
accounted for by a single factor. This under- 
lying factor is better estimated by the S-D 
scale than by either F or K. Such an inter- 
pretation is in line with the general findings of 
the studies cited above in which it has been 
noted that most of the variance on objective- 
type personality tests can be accounted for in 
terms of social desirability. 

The obvious heavy saturation of several of 
the clinical scales with what Wheeler terms a 
Psychotic factor but which here would seem 
to be as well labeled as social-undesirability, 
suggests that mental illness, in a generalized 
sense of the term, may be characterized as 
involving behavior which is socially disap- 
proved. This is indicated over a fairly wide 
range of symptom-descriptive items, but par- 
ticularly on the psychotic scales of Pa, Pt, 
and Sc. 


Summary 


A scale of social desirability (S-D) was 
constructed from the MMPI, based on unani- 
mous judgments of 10 judges. It was hy- 
pothesized that this S-D scale would correlate 
significantly with the F and K validity scales 
of the MMPI and also with the nine clinical 
scales. Both hypotheses were confirmed. 

The S-D scale correlated highly with F and 
K and higher with each of them than they 
do with each other. This suggests that a com- 
mon factor underlies the three and that S-D 
is a better estimate of this factor. It was con- 
cluded that test-taking attitudes toward the 
MMPI can be characterized as readiness or 
lack of readiness to respond to socially desir- 
able items. 

The S-D scale had a very highly signifi- 
cant average correlation with the nine clinical 
scales. The correlations were generally higher 
but the pattern very similar in shape to the 
pattern between F and K and the clinical 
scales, again implying a common underlying 
factor. Findings from a factor analysis done 
elsewhere were compared with the data from 
this study. It was demonstrated that the dis- 
tribution of loadings among the clinical scales 
of a Psychotic factor was quite similar to the 
correlation profile with the clinical scales of 
S-D, F, and K. S-D was a better estimate of 
this Psychotic factor than either F or K. It 
was concluded that a common factor under- 
lies many of the clinical scales, particularly 
Pa, Pt, and Sc, and that it may be charac- 
terized as social desirability. 

The findings of this study support the evi- 
dence from other studies cited that much of 
the variance of objective-type personality in- 
ventories can be accounted for in terms of a 
dimension of social desirability. The findings 
also appear to reflect society’s well-known 
negative attitudes toward mental illness of 
labeling it as socially undesirable behavior. 


Received August 23, 1955. 
The Effect of Manifest Anxiety on a Concept 
Formation Task, a Nondirected Learning Task, 
and on Timed and Untimed Intelligence Tests 
Incidental to a validational study of the 
Taylor personality scale of manifest anxiety 
(MAS) (9), data were obtained about the 
following manifest anxiety correlates which 
have been discussed in the literature. 

1. Goldstein has stated that anxiety inter- 
feres with abstract thinking (1). A recent 
study by Wesley (12) failed to support this 
hypothesis. However, the author as well as 
reviewers (6) have pointed out that the re- 
sults of her study are by no means conclusive. 

2. Recent research suggests that stress and 
anxiety interfere with nondirected or inci- 
dental learning (10, 11). 

3. A number of investigations indicate that 
time pressure has a greater disruptive effect 
on anxious than on nonanxious Ss (2, 8). 


Procedure 


Thirty-five male medical and psychiatric 
patients were administered the Taylor MAS 
(7), the Wechsler Adult Intelligence Scale 
(WAIS), Raven’s Progressive Matrices (PM), 
the Bender Gestalt (BG) and the BG Recall 
Test (5). All Ss were at least of normal in- 
telligence and none was psychotic or sus- 
pected of having cortical damage. 

In order to determine the effect of mani- 
fest anxiety on abstraction, Ss’ MAS scores 


1 From the Clinical Psychology Section of the 
Neuropsychiatric Service at the Veterans Adminis- 
tration Hospital, Bronx, N. Y. The author wishes 
to express his appreciation to Dr. H. L. Flowers, 
Chief of Neuropsychiatry, for his interest and sup- 
port and to Dr. R. S. Morrow, Chief Clinical Psy- 
chologist, and Dr. Julia C. Hall, Assistant Chief 
Clinical Psychologist in charge of research, for their 
encouragement and help. 


were correlated with their scores on PM, 
which involves the eduction of patterns and 
relations. 

As a test of nondirected learning the Recall 
phase of the BG was used. In the learning 
phase, usually perceived as a drawing ability 
test, Ss were instructed to copy nine designs, 
which were presented one at a time. Follow- 
ing this phase of the test, Ss were asked to 
reproduce as many designs as they could re- 
call. For experimental control purposes the 
same test was administered to a group of 33 
similar Ss, with directions to memorize the 
designs as they were copying them because 
they would be asked to recall them. Time 
spent on the learning phase was the same for 
the two groups. The MAS scores of both 
groups were correlated with their scores on 
the Recall Test. 

Finally, in order to determine the effect of 
time pressure on anxious and nonanxious Ss, 
the WAIS was divided into two parts: time- 
limited subtests (arithmetic, digit symbol, pic- 
ture completion, picture arrangement, block 
design, and object assembly) and subtests 
with no time limit (all the remaining sub- 
tests). The performance of the 10 highest 
Taylor scorers (35 and higher) and of the 10 
lowest Taylor scorers (15 and less) on these 
two sets of subtests were compared. 


Results 
The correlation between MAS scores and 
PM scores was r = -.41, which for 34 df is 
significant beyond the .01 confidence level. 
For the nondirected learning group the 
correlation between MAS scores and BG Recall 
Test scores was r = -.44, which is significant 
beyond the .01 confidence level. However, 
for the directed learning group the 
correlation was r = .07, which is clearly not 
significantly different from zero. 

Finally, Table 1 indicates that only the 
high MAS scorers had significantly lower 
scores on the time-limited subtests of the 
WAIS. 


Table 1 

Mean Scores on Timed and Untimed WAIS Subtests in Relation to Taylor MAS Scores 

                           Mean            Mean 
                           timed           untimed 
Group                  N   tests    SD     tests    SD     t† 
High Taylor Scorers   10   10.10   1.81    11.48   2.15   2.51* 
Low Taylor Scorers    10   11.71   2.12    10.86   2.03   1.24 

* Significant at the .05 confidence level. 
† Fisher's t for correlated means. 


Discussion 

The results of this study seem self-explanatory. 
However, the difference between 
the findings of the present study and those of 
Wesley (12) invites discussion. Whereas in 
the present study a negative correlation 
between MAS scores and PM scores was 
obtained, in the Wesley study the high MAS 
scorers did consistently though not significantly 
better on the Wisconsin Card Sorting 
Test (WCST), also a test of abstraction. 
That anxiety is not always a disruptive force 
and actually in some cases may be a motivating 
and facilitating force, has been pointed 
out by philosophers and psychologists (3, pp. 
226-234). This differential effect of anxiety 
leads to the obvious but very important 
question: when does anxiety act as a disruptive 
force, and when does it act as a motivating 
and facilitating factor? One possible reply is 
that it depends on the degree of anxiety. Up 
to a certain point anxiety may be a facilitating 
force, but beyond this optimum point 
anxiety becomes a disruptive force. It is quite 
probable that the Ss of the present study, 
being hospitalized patients, had more than 
the optimum level of anxiety. Also, whereas 
in the Wesley study the correct response was 
reinforced, no such reinforcement took place 
in the PM. Lack of reinforcement, especially 
on a task which was part of a diagnostic 
evaluation, may have increased Ss’ anxiety 
beyond the optimum level. 

Recent research suggests that the effect of 
anxiety upon performance depends upon the 
difficulty of the task involved or, more pre- 
cisely, upon the initial relative strengths of 
correct and incorrect response tendencies in 
the situations (4, 12). Thus the difference be- 
tween Wesley’s and the present findings may 
be due to the fact that the WCST is a much 
simpler task than the PM, and therefore less 
subject to disruption by anxiety. Moreover, 
the facilitating effect of anxiety in the WCST 
may be due to the fact that during the learn- 
ing phase of the WCST the correct response 
is reinforced which increases the 
strength of the correct response. 

Preliminary research suggests that motivation 
to do well is another important variable 
in determining whether anxiety will act as a 
disruptive or a facilitating force. The hospital 
files contained 14 high Taylor MAS scorers 
(35 and higher) whose average standard score 
on anxiety-sensitive tasks (arithmetic, digit 
span, digit symbol, block design, and object 
assembly of the WAIS, BG Recall Test and 
PM) (9) was below their average score on 
the remaining subtests of the WAIS. How- 
ever, the hospital files also contained 9 high 
Taylor scorers whose average standard score 
on the anxiety-sensitive tasks was above their 
average standard score on the tasks not sensi- 
tive to anxiety. An investigation of the be- 
havioral descriptions in the psychological re- 
ports about these two groups of Ss revealed 
that whereas a majority of the good perform- 
ers on the anxiety-sensitive tasks were de- 
scribed as cooperative, well motivated, and 
interested in the tests, a majority of the poor 
performers on the anxiety-sensitive tasks were 
described as apathetic, poorly motivated, and 
Table 2 

Test Performance of High Taylor MAS Scorers in 
Relation to Motivation 

                               Superior         Inferior 
Motivation as reported         performance on   performance on 
in diagnostic report           anxiety tests    anxiety tests 
Cooperative, interested, 
  well motivated                     6                1 
Reluctant, poorly motivated, 
  apathetic                          0               10 
No description of Ss’ 
  motivation                         3                3 

Note. The chi-square value of the distribution in this table 
is 13.10, which for 2 df is significant beyond the .01 confidence 
level. However, due to the low frequencies this probability is 
merely suggestive. 


reluctant to take the tests. Table 2 gives a 
more detailed analysis of the behavioral de- 
scriptions of the two groups. These results 
suggest that one’s need and desire to achieve 
is an important variable in determining 
whether anxiety is going to have a disruptive 
or a facilitating effect. Thus the difference 
between Wesley’s and the present findings 
may be due to the fact that the Wesley popu- 
lation consisted of college students who prob- 
ably were much more motivated to do well 
on a concept-formation task than was the 
population of the present study, and there- 
fore anxiety did not act as a disruptive force. 
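
The chi-square value given in the note to Table 2 can be checked from the six cell frequencies of that table. The sketch below computes the ordinary Pearson chi-square for a contingency table; the function name is hypothetical, and no continuity correction is applied.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Cell frequencies from Table 2 (columns: superior, inferior performance).
counts = [
    [6, 1],    # cooperative, interested, well motivated
    [0, 10],   # reluctant, poorly motivated, apathetic
    [3, 3],    # no description of motivation
]
stat, df = chi_square(counts)
print(round(stat, 2), df)
```

The result agrees with the reported value of 13.10 for 2 df. Every expected frequency here is below 7, which is the basis for the caution in the table note that the probability is merely suggestive.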


Summary 


Thirty-five psychiatric and medical pa- 
tients were administered the Taylor MAS, 
the WAIS, the PM, the BG with the usual 
instructions, and the BG Recall Test. An- 
other group of 33 similar Ss was adminis- 
tered the Taylor MAS, the BG, with instruc- 
tions to learn the designs, and the BG Recall 
Test. 

The correlation between MAS scores and 
scores on the PM was r = — .41. In the non- 
directed learning group the correlation be- 
tween MAS scores and scores on the BG Re- 
call Test was r = — .44, but in the directed- 
learning group the correlation was r = .07. 


Finally, Ss who received high scores on the 
MAS obtained significantly lower scores on 
the timed than on the untimed subtests of the 
WAIS. 

These findings suggest that anxiety has a 
disruptive effect on abstraction, incidental 
learning, and timed intelligence tests. 

It was suggested that the nature of the 
task, the degree of anxiety, and motivational 
factors are important variables in determin- 
ing whether or not anxiety will have a dis- 
ruptive effect. 


Received July 19, 1955. 


References 


1. Goldstein, K. The organism. New York: Ameri- 
can Book Co., 1939. 

2. Matarazzo, J. D., Ulett, G. A., Guze, S. D., & 
Saslow, G. The relationship between anxiety 
level and several measures of intelligence. J. 
consult. Psychol., 1954, 18, 201-205. 

3. May, R. The meaning of anxiety. New York: 
Ronald, 1950. 

4. Mayzner, M. S. A critical review of recent ex- 
perimental studies relating anxiety to learn- 
ing. Psychol. Newsltr, 1954, 5, 117-137. 

5. Peak, R. M., & Quast, W. A. A scoring system 
for the Bender-Gestalt. Hastings, Minn. (Box 
292) and Minneapolis, Minn. (2810 42nd St.): 
Authors, 1951. 

6. Taylor, D. W., & McNemar, Olga W. Problem 
solving and thinking. Annu. Rev. Psychol., 
1955, 6, 455-482. 

7. Taylor, Janet A. A personality scale of manifest 
anxiety. J. abnorm. soc. Psychol., 1953, 48,
285-290. 

8. Sarason, S. B., Mandler, G., & Craghill, P. G. 
The effects of differential instructions on anx- 
iety and learning. J. abnorm. soc. Psychol., 
1952, 47, 561-565. 

9. Siegman, A. W. Cognitive, affective, and psycho- 
pathological correlates of the Taylor MAS. J. 
consult. Psychol., 1956, 20, 137-141. 

10. Siegman, A. W. Some effects of mild electric 
shock on undirected learning. Paper read at 
East. Psychol. Ass., Atlantic City, March, 
1956. 

11. Silverman, R. E. Anxiety and the mode of re- 
sponse. J. abnorm. soc. Psychol., 1954, 49, 
538-542. 

12. Wesley, Elizabeth L. Perseverative behavior in a 
concept-formation task as a function of mani- 
fest anxiety and rigidity. J. abnorm. soc. Psy- 
chol., 1953, 48, 129-134. 


Journal of Consulting Psychology 
Vol. 20, No. 3, 1956 


The Influence of Ego-Involvement on Relations 
Between Authoritarianism and Intolerance 
of Ambiguity¹


Anthony Davids 


Brown University and Bradley Home 


After repeated failure to replicate previ- 
ously reported findings between ethnocentric- 
ism and rigidity (19), Brown (3) discovered 
that a crucial variable to be considered in this 
research was the condition under which the 
subjects (Ss) were assessed. He found that 
when the Ss were administered the California 
F scale and the Einstellung arithmetic prob- 
lems in a friendly, casual, relaxed atmosphere 
there was no relation between authoritarian-
ism and rigidity.² However, when the Ss
were administered these same measures under 
formal, somewhat threatening, ego-involved 
conditions there was a significant positive as- 
sociation between authoritarianism and prob- 
lem-solving rigidity. On the basis of these 
findings, Brown concluded, “An adequate op- 
erational definition of this rigidity must in- 
clude the establishment of an ego-involving 
testing atmosphere. The ‘same’ measure of 
rigidity employed in a relaxed testing atmos- 
phere produces scores which are not related 
to authoritarianism” (3, p. 475). 

In a recent paper, Davids (6) reported 
finding no relation between authoritarianism 


1 This study was carried out at the Harvard Psy- 
chological Clinic. It was supported in part by the 
Laboratory of Social Relations and in part by grant 
M-700 from the National Institute of Mental Health, 
Public Health Service. 

2 Since the F scale (authoritarianism) and the E
scale (ethnocentricism) have been shown to correlate 
.77 (1) they are frequently used interchangeably and 
results are often generalized from one scale to the 
other. Also, the concepts “rigidity” and “intolerance 
of ambiguity” have been used interchangeably (11, 
12) and both have been applied to high authori- 
tarians. In the present report we will not attempt to 
differentiate between these concepts. 


and intolerance of ambiguous visual or audi- 
tory stimuli. Since the Ss in the study were 
volunteers participating in a long-term re- 
search project, were examined under excep- 
tionally comfortable, nonthreatening condi- 
tions, and were guaranteed anonymity, it 
seems possible that the negative findings 
might be attributable to the assessment con- 
ditions. The present experiment is an attempt 
to discover if a highly ego-involving atmos- 
phere will lead to significant relations between 
authoritarianism and ambiguity tolerance. 
Specifically, there are two main purposes in 
this investigation: one is to see if the previ- 
ously reported positive findings (6) of signifi- 
cant relations between authoritarianism and 
maladjustment can be replicated; the other 
is to see if the negative findings in regard to 
relations between authoritarianism and in- 
tolerance of ambiguity will be found when the 
Ss are highly ego-involved in the experimental 
tasks. In view of the import of authoritarian 
personality theory (1) and the controversial 
findings that have been reported by investiga- 
tors who have worked with this theory (3, 4, 
6, 9, 14, 17), the attempt to replicate previ- 
ous studies and to clarify the contradictory 
empirical evidence seems particularly war-
ranted.


Method 
Subjects 


Twenty-two male undergraduates were se- 
cured from a college employment office. They 
were well-matched with the 20 Ss used in the 
previous study (6) in regard to such vari-
ables as socioeconomic background, religion,
academic performance, and extracurricular ac-
tivities. They differed from the previous group
of Ss in that none of them were psychology 
majors, they had not volunteered to take psy- 
chological tests, and all were registered at the 
employment office as seeking employment. 


Experimental Treatment 


A staff psychologist contacted the student 
employment office and said that he would like 
to interview approximately 20 students in or- 
der to select a few research assistants. Indi- 
vidual appointments were arranged, and the 
psychologist conducted a personal interview 
with each prospective employee. The inter- 
view took place in the psychologist’s office 
and a formal, serious attitude was maintained 
throughout the interview. An attempt was 
made to uncover basic information about the 
person, as well as getting him ego-involved in 
the assessment procedure. Each S was told
that we planned to hire one or two students 
to assist us with psychological research proj- 
ects and that the persons selected for the po- 
sitions would be paid at a considerably higher 
rate than the usual student pay scale. After 
indicating strong interest in securing the po- 
sition, as all of the 22 Ss did, the S was told 
that a battery of psychological tests would be 
administered and that our selection would be 
based on the test results. Thus, it is evident 
that these individuals took the experimental 
measures under conditions that were far from 
relaxed. They needed work, a good job de- 
pended upon their test performance, and the 
tests were administered in a very formal, com- 
petitive, business-like atmosphere. It seems 
justified to conclude that these men were 
highly ego-involved in the task at hand. Fol- 
lowing completion of the test battery, the Ss 
were paid for their time, and shortly there- 
after each person was sent a personal letter 
informing him that owing to failure of a re- 
search grant to come through we would be 
unable to add any assistants to our staff. 


Measures 


Authoritarianism. The Ss were adminis- 
tered the standard 30-item F scale (1). Total 
scores ranged from a low of 63 to a high of 
122, and the mean score per item was 3.14. 


Ego structure. On the basis of the personal 
interviews, the clinical psychologist rank-or- 
dered the Ss on “ego structure.” A detailed 
definition of this concept has been presented 
by Murray and Kluckhohn (18) who indi- 
cate that good ego structure is highly corre- 
lated with good personal and social adjust- 
ment. In the present study, the psychologist 
did not have a great deal of information on 
which to base his evaluations, and it should 
be recognized that there is considerable room 
for error in this ranking. However, the psy- 
chologist knew that he would have to rank- 
order the Ss on this dimension and tried to 
secure information and clinical impressions 
that he deemed pertinent to an evaluation of 
ego structure. At the completion of each in- 
terview, he made notes and assigned ratings 
on several personality variables and a rating 
on ego structure. He then used this material 
to assist him in making his final judgments 
of the Ss’ relative standing on “ego struc- 
ture.” 

Manifest anxiety. The Ss were administered 
the Taylor scale of manifest anxiety (20). 
Scores ranged from 1 to 29, with a mean of 
12. 

Psychosomatic inventory. The Ss were ad- 
ministered this inventory which is supposed 
to provide a measure of neuroticism (15). 
High scores indicate normality and low scores 
indicate neuroticism. Scores for the present 
Ss ranged from 37 to 364, with a mean of 254. 

Academic achievement. Transcripts of the 
Ss’ academic records were secured from the 
registrar’s office, and they were rank-ordered 
according to their grade-point average. These 
averages ranged from A- to D+, with a
mean performance at the B- level.

Reactions to ambiguous visual stimuli. The 
Ss were individually administered the ink-
blot concepts from the McReynolds Concept
Evaluation Technique (16) as modified by 
Eriksen (8, 9). Detailed description of this 
procedure is presented elsewhere (6, 8, 16). 
Briefly, the procedure consists of pointing out 
various Rorschach concepts to the S and ask-
ing, “Could this be a ____?” Half of the 50
concepts are scored plus and half are scored 
minus according to Beck’s (2) frequency 
tables. In the present experiment, the num-
ber of rejections of the 50 concepts (i.e., say-
ing “No, it could not be”) ranged from 11 to
35, with a mean rejection score of 23. 

Reactions to ambiguous auditory stimuli. 
The Ss were administered the Azzageddi Test, 
which is an auditory projective technique 
consisting of passages of spoken communica- 
tion containing contradictory and irreconcil- 
able statements and ideas (6, 7). Although 
each statement, by itself, is meaningful and 
coherent, when several statements are inter- 
mingled into a passage of speech, there is 
much confusion and contradiction inherent in 
the passage. Consequently, the S is confronted 
with confusing and contradictory ideas and 
asked to recall as many of the ideas as he 
can from each passage. The total number of 
phrases and statements in the test is 112. 
The number of items recalled by the present 
Ss ranged from a low of 35 to a high of 70, 
with a mean recall score of 55. After hearing, 
and recalling, the eight passages which con- 
stitute this test, the Ss were presented with 
sheets on which they could indicate their per- 
sonal reactions to the auditory projective test. 
On 6-point rating scales, they indicated the 
degree of ambiguity they perceived in the 
spoken material, and the degree of satisfac- 
tion or dissatisfaction they experienced while 
attempting to cope with the demands of this 
auditory test. 


Predictions 


The predictions were the same as those 
tested in the earlier study (6) where a de-
tailed discussion of the rationale and back-
ground of theoretical formulation is pre-
sented. Briefly, these predictions were derived
from the theory of authoritarian personality 
(1) and from previous experimental findings 
(10, 11, 12, 13, 19). Specifically, it was pre- 
dicted that authoritarianism would be posi- 
tively associated with the number of rejec- 
tions of Rorschach concepts, manifest anxiety, 
and neuroticism, and would be negatively as- 
sociated with the number of ideas recalled on 
the auditory test, grade-point average, and 
ego structure. That is, we expected that Ss 
who score relatively high on authoritarianism 
(F scale) would be: (a) less intelligent (low 
grade-point average), (b) more anxious and
maladjusted (high Taylor score, low P-S In- 
ventory score, low rank on ego structure), 
and (c) more intolerant of ambiguity (high 
rejection of ambiguous Rorschach concepts 
and low recall of ideas contained in ambiguous 
spoken communications). 


Results and Discussion 


The results presented in Table 1 indicate 
that only one prediction is confirmed. There 
is a significant negative correlation between 
scores on the F scale and intelligence as meas- 
ured by academic achievement. This finding 
of a negative relation between authoritarian- 
ism and intelligence is in agreement with find- 
ings reported in several previous investiga- 
tions (1, 5, 6, 13). 
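
The significance levels attached to the rank-order coefficients in
Table 1 can be roughly checked by converting each coefficient to a t
statistic with N - 2 = 20 degrees of freedom. This is a minimal sketch
under that approximation; the report does not say how significance was
determined, and exact rank-correlation tables could give slightly
different cutoffs. The function names are hypothetical, and the critical
values are the standard one-tailed t values for 20 df.

```python
import math

# One-tailed critical values of Student's t for 20 degrees of freedom
# (N - 2 with N = 22), taken from standard t tables.
T_CRIT_20_DF = {0.05: 1.725, 0.01: 2.528}


def t_from_r(r: float, n: int) -> float:
    """Convert a correlation coefficient to a t statistic with n - 2 df."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def one_tailed_level(r: float, n: int = 22) -> str:
    """Label |r| as significant at .01, at .05, or not significant.

    Using the absolute value is a simplification: it presumes the
    coefficient lies in the predicted direction.
    """
    if n != 22:
        raise ValueError("critical values are tabulated for n = 22 only")
    t = abs(t_from_r(r, n))
    if t >= T_CRIT_20_DF[0.01]:
        return ".01"
    if t >= T_CRIT_20_DF[0.05]:
        return ".05"
    return "n.s."


# Coefficients as read from Table 1: F scale with Rorschach rejections
# (+.30), F scale with manifest anxiety (+.25), and ego structure with
# the P-S Inventory (read here as +.41).
for r in (0.30, 0.25, 0.41):
    print(f"r = {r:+.2f}: t = {t_from_r(r, 22):.2f}, level = {one_tailed_level(r)}")
```

On this approximation a coefficient must reach roughly .36 to .37 to
pass the .05 one-tailed level with 22 Ss, which is consistent with the
remark that correlations in the vicinity of .25 are not significant.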

Table 1

Rank-Order Intercorrelations Among Experimental Measures
(N = 22)

                  Ego        College    Auditory   Rorschach  Manifest   P-S
Measure           structure  grades     test       test       anxiety    Inventory
F scale           -.23       -.?0**     +.10       +.30       +.25       -.27
Ego structure                -.14       +.22       +.22       -.36*      +.41*
College grades                          +.04       -.21       -.08       -.01
Auditory test                                      -.06       -.01       +.02
Rorschach test                                                -.31       +.25
Manifest anxiety                                                         [illegible]

* Significant at the .05 level for a one-tailed test.
** Significant at the .01 level for a one-tailed test.

Turning to the failure to find a signifi-
cant correlation between authoritarianism and
measures of adjustment, it seems highly prob-
able that the condition under which the Ss
were assessed does not permit a fair test of
the predicted relation. Since these Ss thought
the test results would be used for psychologi-
cal screening, they undoubtedly distorted their
responses to the questionnaires designed to 
measure anxiety and neuroticism. Statistical 
test of the differences between the Taylor 
scale and P-S Inventory scores of the present 
Ss and the scores obtained by the research Ss 
in the previous study show that on both 
measures the mean scores in the present ex- 
periment indicate significantly (p = .01) less 
anxiety and neuroticism. In the earlier study 
(6), highly significant relations were found 
between authoritarianism, manifest anxiety, 
and weak ego structure. In view of the fact 
that the two inventory measures of personal 
adjustment were significantly different in the 
present experiment, and the fact that the 
present clinical evaluation of ego structure 
was based on less information and informa- 
tion gained in an interview in which the Ss 
were trying to make a favorable impression, 
there is no doubt that the test of our hy- 
pothesized relation cannot be considered a 
crucial test. It is interesting to note, however, 
that the three measures of adjustment—ego 
structure, manifest anxiety, and neuroticism 
on the P-S Inventory—intercorrelate signifi- 
cantly. Thus, it appears that if the Ss were 
consciously trying to create an impression of 
good personal and social adjustment, as we 
infer, they were able to present a consistent, 
though distorted, picture. Under this condi- 
tion of high ego-involvement, with much to 
be gained by presentation of a mature, well- 
adjusted, personality picture, we did not find 
a significant relation between the measures 
of authoritarianism and the measures of per- 
sonality. However, because of the peculiar na- 
ture of the assessment condition, which was 
necessary for testing the effect of ego-involve- 
ment on relations between authoritarianism 
and intolerance of ambiguity, we cannot ac- 
cept the present negative findings as conclu- 
sive evidence. 

For one thing, it should not be overlooked 
that each of the measures of adjustment cor- 
related in the predicted direction with the 
measure of authoritarianism. All these corre- 
lations are in the vicinity of .25 and, though 
they are not statistically significant and are 
not of equivalent magnitude to those obtained 
in the earlier study, the fact that we secured 


even this magnitude of relation between our 
measures, as predicted, makes one think that 
there may well be a relation between authori- 
tarianism and maladjustment. In view of the 
fact we could not replicate our previous find- 
ings, however, together with the series of 
negative findings reported by another investi- 
gator (17), it seems highly probable the rela- 
tion between authoritarianism and indices of 
personal adjustment may be of limited gen- 
erality. That is, the type of Ss being ex- 
amined, the setting in which they are ex- 
amined, rapport, ego-involvement, and a host 
of additional variables may have important 
effects on this proposed relation. Very defi- 
nitely, there is need for further systematic 
experimentation, with more conclusive find- 
ings, before it would seem justified to con- 
clude that there is a general relation between 
authoritarianism and personality maladjust- 
ment. 

At this point, let us consider the failure to 
confirm the prediction concerning the relation 
between authoritarianism and intolerance of 
ambiguity. The assessment condition was de- 
signed expressly to permit further test of 
Brown’s (3) finding that under high ego- 
involvement there was a significant correla- 
tion between the F scale and measures of 
rigidity or intolerance of ambiguity. The re- 
sults shown in Table 1 provide definite evi- 
dence that, in the present study, the hy- 
pothesis was not confirmed. Chi-square tests 
of association between authoritarianism and 
personal reactions to the auditory projective 
test support this conclusion. There was no 
significant relation between the F scale and 
ratings of degree of ambiguity perceived in 
the auditory stimuli (χ² = .19) and no sig-
nificant relation between the F scale and rat-
ings of degree of liking or disliking for the
auditory test (χ² = .74). In other words, even
though in this experiment the Ss were ex- 
amined under a condition of exceptionally 
high ego-involvement, which should be con- 
ducive to securing the predicted positive re- 
lation, we again found no significant associa- 
tion between authoritarianism and measures 
of tolerance of ambiguous visual stimuli and 
ambiguous auditory stimuli. Thus, in two in-
dependent studies we have been unable to
secure results in keeping with predictions de-
rived from reports of previous research.

Two points should be emphasized in re- 
gard to the present findings. The first is that 
the F scale scores do not appear to be con- 
sciously distorted. Contrary to the findings of 
significantly less manifest anxiety and neu- 
roticism in these Ss than in the research Ss 
in the earlier study, the mean F scale score 
for these Ss is actually slightly higher than 
the mean for the previous group. Thus, we 
infer that these Ss were not consciously try- 
ing to appear less authoritarian and that their 
F scale scores are probably valid. The second 
noteworthy point is that the correlation be- 
tween the F scale and number of Rorschach 
rejections is not far below the acceptable level 
of statistical significance and is considerably 
higher in magnitude than the comparable co-
efficient obtained under non-ego-involved con-
ditions. We certainly can place little faith in
nonsignificant findings, but we would like to 
interpret the present finding as at least sug- 
gestive that ego-involvement may be an im- 
portant determinant of the hypothesized re- 
lation. At any rate, further study of this 
proposed determinant seems both necessary 
and worthwhile. 

Since other investigators, employing a va- 
riety of experimental tasks, have also failed 
to find a significant relation between authori- 
tarianism and intolerance of ambiguity (9, 
14), we would advocate considerable caution 
and skepticism until the contradictory evi- 
dence has been reconciled. We do not yet 
know the conditions or variables that influ- 
ence this relation and until we have more un- 
equivocal experimental evidence it seems best 
to place qualifications on the general state- 
ment that authoritarians are rigid or intoler- 
ant of ambiguity. 

In conclusion, we would like to emphasize 
the need for replication studies in this area 
of research. In fact, more general practice of 
replication of psychological experiments would 
seem to be highly desirable. It appears that 
few investigators bother to repeat their own 
experiments and few have the desire to spend 
their research time and energy doing an ex- 
periment that has already been done by some- 
one else. However, when exact replications or 
slightly modified extensions of previous stud-
ies result in similar positive findings one
could then feel much more confident of the 
generality and consistency of the phenomena. 
And when several independent studies, which 
should logically bear upon common phe- 
nomena, yield negative or inconsistent find- 
ings the need for further experimentation and 
restricted generalization would become ap- 
parent. 


Summary 


The present study was concerned with rela- 
tions between authoritarianism, intelligence, 
personal adjustment, and intolerance of am- 
biguity. This experiment is an extension of an 
earlier study in which Ss who were relatively 
high on authoritarianism were found to be 
relatively low on intelligence, high on mani- 
fest anxiety, and low on ego structure. Con- 
trary to expectation, however, in the previous 
study no relation was found between authori- 
tarianism and intolerance of ambiguous visual 
and auditory stimuli. Since recent experimen- 
tal evidence suggested that ego-involvement 
may be an important determinant of the re- 
lation between authoritarianism and rigidity 
or intolerance of ambiguity, the present in- 
vestigation was designed to see if positive 
findings would be obtained when the Ss were 
examined under highly ego-involving condi- 
tions. 

Twenty-two male undergraduates under- 
went a battery of tests and personnel assess- 
ment procedures under the assumption that 
the results would be used in selecting some- 
one for a highly desirable job. These Ss were 
registered at the student employment office, 
were in need of work, and were eager to pass 
the psychological screening. Thus, it seems 
warranted to conclude that the assessment 
conditions were seen as formal, somewhat 
threatening, and highly ego-involving. In-
cluded in the assessment battery were a per-
sonal interview with a clinical psychologist,
the F scale, the manifest anxiety scale, the 
P-S Inventory, Rorschach concept evaluation 
technique, and an auditory test consisting 
of ambiguous spoken communications. Also, 
grade-point averages were obtained from the 
Ss’ academic transcripts. 

The only prediction confirmed in this study 
was the finding of significant negative corre-
lation between authoritarianism and intelli-
gence as measured by academic achievement.
Authoritarianism correlated in the predicted 
direction with the three measures of adjust- 
ment (ego structure, manifest anxiety, and 
neuroticism), but none of these coefficients 
were statistically significant. However, since 
the two inventory measures of adjustment in- 
dicate significantly less anxiety and neurot- 
icism in the present Ss in comparison with 
previously studied Ss, it seems that the ego- 
involving conditions probably led to con- 
scious distortion of responses to these meas- 
ures of adjustment. Therefore, these negative 
findings should not be regarded as conclusive 
evidence. 

No significant relation was found between 
authoritarianism and rejection of Rorschach 
concepts or recall of ideas contained in am- 
biguous spoken passages. Supporting this 
negative finding, there was also no relation 
between authoritarianism and a direct meas- 
ure of ambiguity tolerance based on the Ss’ 
ratings of their personal reactions to am- 
biguous auditory stimulus material. 

It was concluded that relations between 
authoritarianism, personal adjustment, and 
ambiguity tolerance are far from completely 
understood. It was suggested that these rela- 
tions are probably of limited generality and 
that attention must be paid to the factors in- 
fluencing them. And the need for replication 
studies was emphasized. 


Received August 23, 1955.


Ideational Expression of Hostile Impulses¹


Robert D. Wirt * 


Stanford University 


That hostility is of major importance within 
personality structure and that feelings about hos-
tility are essential problems in psychotherapy
are beliefs widely held among contemporary 
theorists and clinicians. Although considerable 
research has been done during the past dec- 
ade or so on the expression of aggression, little 
fundamental research has been carried out to 
demonstrate the relative importance of hos- 
tility or its proper place in a comprehensive 
picture of personality dynamics. In fact it is 
difficult to find more than the bald statement, 
in most theories, that hostility is very impor- 
tant and that it is closely related to psycho- 
pathology. It is with this relationship be- 
tween psychopathology and hostility that the 
present work is concerned. 

It is commonly believed that the unabashed 
release of aggression is due to a breakdown of 
normal psychic control, or to a pathological 
increase in hostile elements of personality. 
Thus our laws provide special treatment for 
offenders whose crimes were committed dur- 
ing times of “temporary insanity.” Expres- 
sions such as “he must have been crazy to do 
a thing like that” usually have reference to 
destructive acts. 

Freud (8, 10, 11, 12, 13) put heavy em- 
phasis on hostility as a reaction to threat to 
security, though in later elaborations of psy- 
choanalytic theory (28) he shows hostility to 
derive from libido in combination with forces 
(the repetition compulsion) of the death in- 
stinct. The early formulation is greatly ex-


1 This report is based in part on a dissertation sub-
mitted to the Department of Psychology and the
Committee on Graduate Study of Stanford Univer-
sity, April, 1952. The writer wishes to express his
appreciation to Professors C. L. Winder, Maud A.
Merrill, and Douglas Lawrence for their generous
help in this investigation.

panded by Dollard and his associates (3, 4,
21), but hostility becomes quite simply an 
innate response to frustration. Neo-Freudian 
theories (8, 14, 16, 18, 19, 22, 23, 25, 27, 
28) view hostile expression more clearly in 
terms of interpersonal relations and most of 
these theorists see hostility as arising as a 
natural reaction to frustration with its con- 
trol and expression being governed by mores 
of the society. With a breakdown of tolerance 
for frustration (or “ego disorganization’’) the 
individual is expected to display increased 
aggression and increasingly inappropriate ex- 
pression of hostile behavior. Murray (22) 
explicitly states that there is a direct rela- 
tionship between hostility and psychopathol- 
ogy, neurotics being more hostile than nor- 
mals; and psychotics, he asserts, are most 
hostile of all. Cultural anthropologists and 
sociologists have stressed the differences in 
the freedom of expression of aggression among 
members of differing social groups (2, 29). 
Presumably this leads to differences in the 
degree to which hostility is a problem for 
members of these groups. 

In general, then, there is some agreement 
among various theorists that hostility is of 
central importance in psychopathology and 
that its expression is modified by social ex- 
perience. 

Several questions are raised by this sum- 
mary. If hostility is directly related to the de- 
gree of psychopathology, the broadest mean- 
ing of this assertion must be true, at the very 
least: normal individuals will express less hos- 
tile ideation than neurotics, who will express 
less than psychotics. Hostile ideation must be 
a precursor (or concomitant) of aggressive 
behavior since no matter how repressed the 
hostility may be, if it is really a problem it 
must eventually, no matter in how disguised 


- 
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a form, be expressed ideationally. Further- 
more, if the expression of aggression is bound 
by cultural controls one might reasonably ex- 
pect to find differences in the amount of hos- 
tile ideation expressed among different sex, 
age, educational, and socioeconomic groups; 
presumably women and persons more fre- 
quently and longer exposed to the societal 
values have more opportunity and more pres- 
sure to learn the control of hostility and less 
freedom for hostile expression. Beyond this 
there may be differences in the mode of such 
expression. 


Procedure 


Finney’s Palo Alto Aggressive Content Scale 
(6, 7, 26) was used as the measure of hostile 
ideation. This scale is based upon the Ror- 
schach content. Rorschach (24) originally 
suggested his technique as a test of ideation. 
It provides a standardized situation in which 
symbolic as well as less disguised thought 
processes are elicited. The Finney scale was 
used, instead of similar scales, such as that 
by Elizur (5) because its scoring is more ob- 
jective (15); it has distinguishable categories 
of hostility and it has been demonstrated to 
separate validly groups of known assaultive 
individuals from nonaggressive persons (26). 
Four categories for classifying aggressive re- 
sponses are provided: Derogatory Remarks, 
Victim of Destruction, Possibly Destructive, 
and Active Destruction. Any response can be 
categorized as nondestructive or as falling 
into one or two of the above categories of 
hostility. 

Three groups were taken for the sample: 
a group of 76 normal, 32 neurotic, and 50 
schizophrenic individuals. An effort was made 
to stratify the groups in such a way that they 
would compare with the general population 
in age, education, and socioeconomic status. 
The normal group contained 38 men and 38 
women, selected from the community among 
persons without psychiatric diagnoses. All in 
the abnormal groups were male patients at 
the Palo Alto Veterans Administration Hos- 
pital. These were individuals whose diagnoses 
were not complicated by known organic con- 
ditions or by admixture of other psychiatric 
reactions and for whom the Rorschach did not 
contribute in the diagnostic study. The char-
acteristics of the sample are described more
fully elsewhere.³


Table 1

Summary Data from Distribution of Raw Scores

Distribution of              Normal     Neurotic   Schizophrenic
responses                    (N = 76)   (N = 32)   (N = 50)
Total number:*
  M                          23.03      23.72      17.80
  SD                         11.76      16.20       7.88
Aggressive responses:†
  M                           6.96      10.19       7.60
  SD                          4.31      10.74       6.20
Nonaggressive responses:
  M                          16.39      14.31      11.32
  SD                          9.55       9.73       8.05
“Victim” responses:
  M                           2.60       5.00       4.44
  SD                          2.17       7.84       5.16
“Aggressor” responses:‡
  M                           4.36       5.19       3.16
  SD                          3.34       5.05       3.11

* Since an aggressive response can be scored in more than
one category, the total number of responses will not be the
exact sum of number of aggressive plus number of nonaggressive
responses.
† Aggressive responses equals the sum of “victim” plus
“aggressor” responses.
‡ Aggressor responses equals the sum of Derogatory Remarks
plus Possibly Destructive plus Active Destruction responses on
the Finney Scale.

The Rorschach protocol of each individual 
was scored by the Klopfer and Kelley system 
(17) and the responses rated on the Finney 
scale. Ratings of a subsample of 393 responses 
gave 95 per cent agreement between two in- 
dependent judges. 
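
The footnotes to Table 1 state that the mean number of aggressive
responses equals the sum of the “victim” and “aggressor” means. The
short check below, offered only as a consistency sketch on the printed
means, confirms that the tabled values satisfy this identity within
rounding. The variable and function names are hypothetical.

```python
# Group means transcribed from Table 1, in the order
# (normal, neurotic, schizophrenic). The schizophrenic aggressive mean
# is printed as "7.600" in the scan and is read here as 7.60.
VICTIM = (2.60, 5.00, 4.44)
AGGRESSOR = (4.36, 5.19, 3.16)
AGGRESSIVE = (6.96, 10.19, 7.60)


def footnote_identity_gaps(victim, aggressor, aggressive):
    """Return aggressive - (victim + aggressor) for each group."""
    return [round(g - (v + a), 2)
            for v, a, g in zip(victim, aggressor, aggressive)]


gaps = footnote_identity_gaps(VICTIM, AGGRESSOR, AGGRESSIVE)
print(gaps)
```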


Results 


Table 1 contains the summary data for the 
distribution of aggressive responses. The “to- 
tal number of responses” is R in Rorschach 
terms, and it includes all responses, both hos- 
tile and nonhostile; “number of aggressive 
responses” refers to the sum of all four cate- 
gories in the Finney scale; “number of non- 
aggressive responses” includes those responses 
not contained in one of Finney’s categories; 
number of “victim” responses refers to the
category of Victim of Destruction responses
in the scale; and number of “aggressor”
responses refers to the sum of three cate-
gories: Derogatory Remarks, Possibly De-
structive, and Active Destruction. An aggres-
sive response can be scored in as many as
two categories. In the sample used here there
was an average of less than one-half response
per record requiring double classification.


3 Wirt, R. D. Pattern analysis of the Rorschach.
To be published.

The mean number of responses for the 
schizophrenic group, 17.80, is considerably 
less than that of the neurotic group, 23.72, 
or of the normal group, 23.03. The correla- 
tion between number of responses and num- 
ber of aggressive responses is .60 for nor- 
mals, .83 for the neurotic group, and .63 for 
the schizophrenic group. This suggests, as one 
might expect, that there is a tendency for an 
increasing number of aggressive responses to 
be associated with increased responsiveness. 
Total number of responses tends to be asso- 
ciated with any measure on the Rorschach 
that is quantitative, since the more responses 
an individual gives the more likely he is to 
increase the number of responses in all cate- 
gories. In order to partial out the influence 
of the number of responses, an analysis of co- 
variance was computed with number of ag- 
gressive responses and total number of re- 
sponses as variables. The same procedure was 
used to test the differences between pairs of 
groups. The analyses show demonstrable dif- 
ferences in the hostile ideation of these groups. 
Table 2 gives the results. 

As the results shown in this table indicate, 
each group differs from each other group, and 
there is a highly significant difference among 
the three groups. When the means are ad- 
justed according to the results of the covari- 
ance analysis by the method suggested by 
McNemar (20, p. 328), it can be seen (Col- 
umn 4, Table 3) that the groups do not fall 
into the predicted order. 
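
The adjustment of means can be illustrated with the usual covariance formula: the adjusted mean equals the group mean on the criterion minus b times the difference between the group mean and the grand mean on the covariate. The Python sketch below applies this to the group means reported in Tables 1 and 3. The pooled within-group slope b is not reported in this article; the value 0.40 is an assumption chosen only for illustration, so the output approximates rather than reproduces the published adjusted means. All function and variable names are hypothetical.

```python
# Illustrative sketch of covariance-adjusted group means.
# Group sizes and means are taken from Tables 1 and 3 of the article.
# ASSUMPTION: the pooled within-group regression slope is not reported
# in the source; ASSUMED_SLOPE is a hypothetical value for illustration.

GROUPS = {
    # name: (N, mean total responses, mean aggressive responses)
    "Normal": (76, 23.03, 6.96),
    "Neurotic": (32, 23.72, 10.19),
    "Schizophrenic": (50, 17.80, 7.60),
}

ASSUMED_SLOPE = 0.40  # hypothetical, not from the article


def grand_mean_covariate(groups):
    """Weighted grand mean of the covariate (total number of responses)."""
    total_n = sum(n for n, _, _ in groups.values())
    weighted_sum = sum(n * x for n, x, _ in groups.values())
    return weighted_sum / total_n


def adjusted_means(groups, slope):
    """Adjust each criterion mean for the group's deviation on the covariate."""
    grand = grand_mean_covariate(groups)
    return {name: y - slope * (x - grand) for name, (_, x, y) in groups.items()}


def rank_order(means):
    """Rank groups from lowest (1) to highest adjusted mean."""
    ordered = sorted(means, key=means.get)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


ADJUSTED = adjusted_means(GROUPS, ASSUMED_SLOPE)
RANKS = rank_order(ADJUSTED)

if __name__ == "__main__":
    for name, value in ADJUSTED.items():
        print(f"{name:14s} adjusted mean {value:5.2f}  rank {RANKS[name]}")
```

Under the assumed slope, the schizophrenic group moves above its raw-score position because it gave fewer responses overall, which is the pattern the adjusted means in Table 3 show.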

Table 2

F Ratios for Analysis of Covariance
for the Three Groups

Groups                                    F Ratio
Normal vs. Neurotic vs. Schizophrenic     6.69**
Normal vs. Neurotic                       8.69**
Normal vs. Schizophrenic                  [illegible]
Neurotic vs. Schizophrenic                5.83*

* Significant beyond the .05 level.
** Significant beyond the .01 level.


Table 3

Means of Aggressive Responses

                 Raw-score   Adjusted   Rank
Group            means       means      order
Normal            6.96        6.36      1
Neurotic         10.19        9.38      3
Schizophrenic     7.60        9.07      2


The control factors were tested by two
methods: analysis of covariance and a
chi-square method suggested by Cronbach (1, p.
411). This method successfully controls the 
effect of total number of responses when dif- 
ferences such as number of aggressive re- 
sponses are being studied among groups. Dis- 
tributions covering the entire range of each 
factor were made for three categories of age, 
four categories of education, and four cate- 
gories of socioeconomic status of each group 
and between the sexes for the normal group. 
None of these was related to the expression 
of hostile ideation in significant degree. In 
the case of the abnormal groups the relation- 
ship was even less pronounced than in the 
normal group, but in no group did either the 
chi square or the F ratio approach an ac- 
ceptable level of statistical significance. The 
Pearson correlation between age and number 
of aggressive responses was only −.04.

Number of aggressive responses was divided 
into those representing “victim” and those 
representing “aggressor,” and Cronbach’s chi- 
square procedure applied. This method failed 
to show any difference in direction of punitiveness
for the social classes. However, a clear
tendency for direction of punitiveness to be
related to psychopathology was found. The F
ratio for analysis of covariance for “victim”
responses was 7.46, which, for 2 and 154 degrees
of freedom, is significant beyond the
.001 level. The trend is for the normals to
show most “aggressor” responses with the 
schizophrenics showing significantly more 
“victim” responses and the neurotic group 
falling in between and showing no character- 
istic direction. 
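
The significance statement for the “victim” analysis can be checked without tables, because an F ratio with 2 numerator degrees of freedom has a closed-form tail probability, P(F > f) = (1 + 2f/d)^(-d/2), where d is the denominator degrees of freedom. The Python sketch below applies it to the reported F of 7.46 with 2 and 154 degrees of freedom. Using the same degrees of freedom for the three-group F of 6.69 in Table 2 is an assumption, since the article states degrees of freedom only for the “victim” analysis. The function name is hypothetical.

```python
def f_tail_prob_numerator_df2(f_ratio, denom_df):
    """Upper-tail probability of an F ratio with 2 and denom_df degrees of freedom.

    The closed form holds only when the numerator has exactly 2 degrees
    of freedom.
    """
    return (1.0 + 2.0 * f_ratio / denom_df) ** (-denom_df / 2.0)


# Reported in the text: F = 7.46 for "victim" responses, df = 2 and 154.
P_VICTIM = f_tail_prob_numerator_df2(7.46, 154)

# ASSUMPTION: the three-group F in Table 2 has the same df (not stated).
P_THREE_GROUPS = f_tail_prob_numerator_df2(6.69, 154)

if __name__ == "__main__":
    print(f"victim responses: p = {P_VICTIM:.4f}")
    print(f"three groups:     p = {P_THREE_GROUPS:.4f}")
```

Both probabilities fall on the side of the thresholds that the text and Table 2 report: below .001 for the “victim” analysis and, under the stated assumption, below .01 for the three-group comparison.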


Discussion 


Clearly these results lead to the conclusion 
that it requires a complex analysis of person- 
ality factors in addition to hostility to classify 
individuals, even grossly, along a continuum
of psychopathology. Individuals in all of the 
groups were found throughout the entire 
range of ideational expression in all of the 
categories of aggression. It would seem that 
hostility cannot be taken necessarily to be a 
critical factor in the mental health of an in- 
dividual. But the group differences are sig- 
nificant and the severely ill individuals were 
shown to be more expressive of hostile idea- 
tion than the normal group. Furthermore, 
there was a definite tendency for the groups 
to show differences in the direction of expres- 
sion of hostile impulses. This investigation 
suggests that those theories relating hostility 
to psychopathology are substantially correct, 
but it will require further elaboration for a 
theory to account for individual differences 
and for the greater ideational expression of 
hostile impulses by neurotic individuals. 

Hostility certainly is a critical factor in our 
social intercourse. This research seems to 
confirm the assertions of some theorists that 
hostility is common, primitive, and pervasive. 
But few differences, and no significant ones, 
were found among different social, economic, 
age, or educational groups. This would sug- 
gest that the effectiveness of methods of so- 
cialization in the handling of these impulses 
leaves much to be desired. It may be that one 
group of individuals is as hostile as another, 
but that the behavioral expression of their 
hostility differs widely, while its ideational 
expression is similar. Given appropriate chan- 
nels of expression the impulse probably should 
not be labeled pathological, unless we wish 
to be in the untenable position of labeling 
nearly everybody as pathological. 


Summary 


Having reviewed a number of theoretical 
positions it was concluded that most con- 
temporary theorists believe hostility to be a 
crucial factor in the mental health of indi- 
viduals and to be subject to cultural pres- 
sures. Groups of normal, neurotic, and psy- 
chotic subjects of differing age, educational, 
and social status (including both men and 
women in the normal group), were compared 
on the amount of ideational expression of 
hostile impulses found on scoring their Ror- 
schach protocols using Finney’s Palo Alto 


Aggressive Content Scale. A statistically sig- 
nificant difference in hostile ideation was 
found to exist among the three groups and 
between any two. The normals showed the 
least hostility, next were the schizophrenics, 
and the neurotic group scored the most. The 
groups showed a significant difference in the 
direction of punitiveness, with the normal 
group tending to direct hostility outward, the 
schizophrenics inward, while the neurotic 
group showed no characteristic direction. 
None of the age, educational, social, or sex 
differences were statistically significant in 
the amount or direction of hostile ideation. 

The results were interpreted to mean that 
many contemporary theorists are in error in 
stating that hostility is directly related to 
profundity of illness and, further, they are in 
error in assuming that great hostility is neces- 
sarily an indication of pathology. However, it 
was emphasized that the results do clearly 
demonstrate a tendency for greater expres- 
sion of hostility among groups of severely 
mentally ill individuals as compared with nor- 
mals and that the direction of expression 
characteristically differs. It was pointed out 
that the lack of support for theories claiming 
that hostility is amenable to social pressure is 
not, per se, proof that cultural, pedagogic, 
efforts are limited, but rather may mean that 
the present methods used to control the ex- 
pression of hostility are inadequate. Further 
research into the relationship between idea- 
tional and behavioral expression of hostile 
impulses is indicated. 


Received August 17, 1955. 
The Attitudes of the Mothers of Male Catatonic and
Paranoid Schizophrenics toward Child Behavior¹


Arnold P. Goldstein and Arthur C. Carr 


Creedmoor State Hospital 


The study was designed to determine if the 
attitudes of mothers of catatonic schizophren- 
ics to child behavior differ from the attitudes 
of the mothers of paranoid schizophrenics, 
testing the hypothesis derived from the re- 
cent suggestion of Arieti (1) that catatonics 
are persons who as children were criticized 
for their actions, while paranoids were criti- 
cized for their intentions and for lying. 

A questionnaire of 56 items related to 
child-raising attitudes was composed, borrow- 
ing from statements which Mark (2) found 
highly differentiating between mothers of 
schizophrenics and nonschizophrenics. Items 
from his scale and newly composed items 
were selected in terms of their relevance for 
the hypothesis. 

The attitude scale was administered to 60 
mothers, 34 of whom had sons who were pres- 
ently diagnosed as catatonic and 26 who had 
sons diagnosed as paranoid. Only clear-cut 
cases of each disorder were chosen. The two 
groups’ responses to these items were com- 
pared, item by item, by means of chi square. 

Results of the statistical analysis offered 
no support for the hypothesis, as only 3 of 
the 56 statements differentiated the groups 
at the .05 level of confidence or better, a 
number not sufficiently in excess of that ex- 
pected purely on a chance basis. 
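
The chance expectation behind this conclusion can be made explicit. With 56 items each tested at the .05 level, about 56 × .05 = 2.8 items would be expected to reach significance by chance alone. The Python sketch below also computes the probability of observing 3 or more such items, under the simplifying assumption (not made in the report) that the 56 tests are independent. The function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(X >= k) for X distributed Binomial(n, alpha)."""
    below = sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k))
    return 1.0 - below


N_ITEMS = 56    # questionnaire items (from the report)
ALPHA = 0.05    # significance level used (from the report)
OBSERVED = 3    # items that differentiated the groups (from the report)

# ASSUMPTION: the item tests are treated as independent.
EXPECTED_BY_CHANCE = N_ITEMS * ALPHA
P_THREE_OR_MORE = prob_at_least(OBSERVED, N_ITEMS, ALPHA)

if __name__ == "__main__":
    print(f"expected by chance: {EXPECTED_BY_CHANCE:.1f} items")
    print(f"P(3 or more):       {P_THREE_OR_MORE:.2f}")
```

Under that assumption, three or more significant items would arise by chance in roughly half of such studies, which agrees with the authors' reading that the result does not support the hypothesis.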


1 An extended report of this study may be obtained
without charge from Arthur C. Carr, Creedmoor
State Hospital, Queens Village, New York, or
for a fee from the American Documentation Insti- 
tute. Order Document No. 4814 from ADI Auxiliary 
Publications Project, Photoduplication Service, Li- 
brary of Congress, Washington 25, D. C., remitting 
in advance $1.25 for microfilm or $1.25 for photo- 
copies. Make checks payable to Chief, Photodupli- 
cation Service, Library of Congress. 


An unexpected finding, however, appeared 
to offer some support for the contention that 
catatonic schizophrenics’ mothers may have a 
conflict over “choice of action” to a degree 
not found in the mothers of patients who were 
paranoid. Of the mothers solicited, mothers 
of catatonics with a significantly greater fre- 
quency reported themselves unable to com- 
plete the questionnaire, and refused to do so. 
Of the mothers cooperating in the study, re- 
fusal to answer individual items as reflected 
in the omission of answers occurred 46 times 
in the responses of the catatonics’ mothers as 
opposed to only 10 times in the responses of 
the mothers of the paranoid patients. 

The limitations of such evidence as offered 
by studies comparable to this cannot be de- 
nied in view of their failure to rule out the 
differential effects of the illness itself, as well 
as numerous other more complicated relation- 
ships which are difficult if not impossible to 
control. Evidence related to the etiology of 
schizophrenia, however, is at the present time 
so far from being unequivocal that it
would appear that one can only maintain an 
open mind, accepting and evaluating accumu- 
lating evidence in the light of its limitations, 
until such time as the definitive, crucial evi- 
dence is available. 


Brief Report. 
Received January 19, 1956. 
The Effectiveness of Psychotherapy with Individuals
Who Have Severe Homosexual Problems 


Albert Ellis 
New York, New York 


Although early writers on homosexuality 
stressed its inherited character and minimized 
its treatability, contemporary investigators— 
including Allen (1, 2), Cory (3), Creadick 
(4), Ellis (5, 6), Anna Freud (7), Henry 
(8, 9), Laidlaw (12), Lewinsky (13), Lon- 
don and Caprio (14), Nedoma (15), Poe 
(16), Srnec and Freund (17), and Westwood 
(18)—tend to view homosexuals as suitable, 
if difficult, subjects for psychotherapy. Un- 
fortunately, however, the recent literature 
contains no account of the treatment of a 
sizable number of cases, with an attempt to 
deal statistically with some of the important 
variables concerned in such treatment. 

The present paper reports on 40 individu- 
als (28 males and 12 females) who were seen 
for five or more psychotherapeutic sessions in 
the author’s private practice during the years 
1951 to 1955. Thirty-six of these patients 
were having overt homosexual relations; while 
four (2 males and 2 females) were not overt 
homosexuals but were obsessed with homo- 
sexual thoughts and were afraid that they 
soon would become inverts. 

In seeing these individuals with severe 
homosexual problems, an active form of psy- 
choanalytically-oriented psychotherapy was 
employed, and one of the main therapeutic 
goals was to help the patient overcome his 
fear of heterosexual relations and, through 
improved sex-love relations with members of 
the other sex, to minimize his homosexual in- 
terests and activities. The therapeutic goal 
was not that of inducing the patient to forego 
all homosexual interests because, as the writer 
has pointed out previously (5, 6), that would 
be unrealistic. The neurotic element in homo- 


sexuality is not the homosexual activity or 
desire itself, since man is biologically a bi- 
sexual or plurisexual animal who, to some de- 
gree, may be considered rare or abnormal if 
he has absolutely no homoerotic desires or
participations during his entire lifetime. The 
abnormality in homosexuality consists of the 
exclusiveness, the fear, the fetishistic fixation, 
or the obsessive-compulsiveness which is so 
often its concomitant. The aim of psycho- 
therapy, therefore, should be to remove these 
elements: to free the confirmed homosexual 
of his underlying fear of or antagonism to- 
ward heterosexual relations, and to enable 
him to have satisfying sex-love involvements 
with members of the other sex. 

The 40 individuals surveyed in this paper, 
then, were treated for their homosexual prob- 
lems or neurosis rather than for their homo- 
sexual desire or activity per se. They were 
deemed to be distinctly or considerably im- 
proved when, during the course of psycho- 
therapy, they began to lose their fears of the 
other sex, to enjoy heteroamative relations, 
to be effective partners in these relationships, 
and to lose their obsessive thoughts about or 
compulsive actions concerning homosexuality. 
The patients seen were largely young people: 
18 being under 25 years of age; 19 between 
26 and 35; and 3 over 36 years. Thirty-one 
of the patients were single; 5 married; 4 di- 
vorced or separated. Twenty-eight were mod- 
erately or distinctly emotionally disturbed; 
12 were very severely emotionally disturbed. 
One had a grade-school education; ten were 
on the high-school level; 23 had some col- 
lege training; and 6 had done graduate work. 
Results


The data gathered by treating 28 males 
and 12 females with serious homosexual prob- 
lems with active psychoanalytically-oriented 
psychotherapy are shown in Table 1. From 
this table some interesting sex differences may 
first be noted. The female patients were more 
improved as a result of treatment, more often 
married or divorced, and more often evalu- 
ated as very severely emotionally disturbed. 
When tested by chi-square analysis, these dif- 
ferences all proved to be significant at the 
.05 level of confidence. There were also nonsignificant
tendencies for the female patients,
when compared to the males, to be less edu- 
cationally advanced, to have had more hetero- 
sexual activity, and to be more desirous of 
making a better heterosexual adjustment. 

The only other relationship in Table 1 that 
proved to be significant when tested by chi-square
analysis was that between the patients’ desires
to achieve heterosexual adjustment
and their benefiting during the period
of psychotherapy. Thus, of the 10 individu- 
als who showed little or no improvement, 5 
displayed little or no desire to achieve hetero- 
sexual adjustment; and of the 30 who showed 
distinct or considerable improvement, 2 dis- 
played little or no desire to achieve hetero- 
sexual adjustment. Using Yates’s correction 
for small cell frequencies, this difference 
proves significant at the .01 level of confi- 
dence. 
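
The corrected chi-square can be reproduced from the frequencies just given. The Python sketch below applies Yates's continuity correction to the 2 × 2 table of improvement (little or none versus distinct or considerable) by desire for heterosexual adjustment (little or none versus moderate or considerable). The cell counts 5 and 28 are derived by subtraction from the group totals of 10 and 30. The function name and the p-value step, which uses the one-degree-of-freedom identity p = erfc(sqrt(chi-square / 2)), are additions for illustration.

```python
from math import erfc, sqrt


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    numerator = n * corrected**2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: little or no improvement; distinct or considerable improvement.
# Columns: little or no desire; moderate or considerable desire.
CHI_SQUARE = yates_chi_square(5, 5, 2, 28)

# Tail probability for chi-square with 1 degree of freedom.
P_VALUE = erfc(sqrt(CHI_SQUARE / 2.0))

if __name__ == "__main__":
    print(f"chi-square = {CHI_SQUARE:.2f}, p = {P_VALUE:.4f}")
```

The resulting value exceeds 6.635, the tabled .01 point for one degree of freedom, in line with the significance level reported in the text.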

Relationship differences which are suggested
by the data of Table 1 but which do not
prove to be statistically significant include
these: (a) The better-educated males seemed
to improve more while being treated. (b)
The less-disturbed males were more improved,
but the more-disturbed females were more
improved with treatment. (c) Individuals
who had engaged in little or no heterosexual
activity prior to treatment were more improved
during psychotherapy. (d) The greater the
length of the treatment, the more was the
tendency of the patients to improve in the
course of it.


Table 1

Relationships Between Benefits Received with Psychotherapy by 28 Male and 12 Female
Homosexual Patients and Selected Other Variables

Number of patients benefited with psychotherapy

                                      Little or no
                                      improvement
Variable                                M     F
Age
  Under 28                              4     0
  28 and over                           6     0
Marital status
  Single                                9     0
  Married, divorced, or separated       1     0
Education
  Grade or high school                  4     0
  College or graduate school            6     0
Clinical evaluation
  Moderately or distinctly
    emotionally disturbed               6     0
  Severely emotionally disturbed        4     0
Extent of heterosexual activity
before therapy
  Little or none                        8     0
  Moderate or considerable              2     0
Desire to achieve heterosexual
adjustment
  Little or none                        5     0
  Moderate                              5     0
  Considerable                          0     0
Length of treatment
  5 to 19 sessions                      8     0
  20 to 220 sessions                    2     0

[Columns for distinct and considerable improvement are not preserved in this copy.]

On the whole, the results—or at least the 
concomitants—of employing psychotherapy 
with individuals having severe homosexual 
problems seemed to be favorable. Of the 28 
male patients, 10 (36 per cent) appeared to 
be little or not at all improved; 7 (25 per 
cent) distinctly improved; and 11 (39 per 
cent) considerably improved. Of the 12 female
patients, 4 (33⅓ per cent) appeared to
be distinctly improved and 8 (66⅔ per cent)
considerably improved. Of the 20 male and 
female patients who entered therapy with 
little or moderate desire to overcome their 
homosexual problems (but who came, in- 
stead, mainly to work on other problems or 
to relieve their guilt over being homosexual), 
10 (50 per cent) achieved some improvement 
and 3 (15 per cent) achieved considerable 
improvement in their heterosexual relations. 
Of the 20 patients who came to therapy with 
a serious desire to overcome their homo- 
sexual problems, all made some improvement 
and 16 (80 per cent) made considerable im- 
provement in their sex-love relations with 
members of the other sex. 


Discussion 


One must be duly sceptical of the results of 
this investigation, as with the results of most 
other studies of psychotherapy, because the 
criterion of improvement utilized is depend- 
ent on the subjective statements of the pa- 
tients and judgments of the investigator. 
Moreover, no control group of similar subjects
who had not been therapeutically seen
was utilized. Hence, all that is known, at 
best, is that the patients of this study im- 
proved with, and not necessarily because of, 
psychotherapy. 

Assuming that the therapy did, to some 
extent, help these patients make a better 
heterosexual adjustment, it may be wondered 
why the females, in spite of their greater dis- 
turbance, improved significantly more often 
than did the males. There may be several possible
reasons for this difference: (a) Through
relating to a male therapist, they may have 
more easily overcome their fear of and/or 
antagonism to males. (b) Because they can 
consummate coital relations with little diffi- 
culty once they decide to risk doing so, fe- 
males may be more willing than males to 
make a sincere effort to engage in sex-love 
involvements with the other sex. (c) Since 
the females in the present sample had had 
more heterosexual activity than had the males 
at the time they entered psychotherapy, they 
may have found it easier to give themselves 
more wholeheartedly to heteroerotic relation- 
ships. (d) The females studied were more de- 
sirous of achieving better adjustments with 
members of the other sex than were the males 
—and this desirousness was found, in the 
present study, to be significantly correlated 
with clinical improvement. 

Two interrelated questions may be asked: 
(a) Why were so many more homosexual 
males than females seen as patients? (b) Of 
the females who were seen, why did they tend 
to be more disturbed than were the males? 

The answer to the first question probably 
is that, in our society, there are many more 
confirmed homosexual males than females— 
a point on which Cory (3), Kinsey and his 
associates (10, 11), and other authorities 
seem to agree. This may be because the sexu- 
ally (and generally) inadequate male in our 
culture will tend to be more rejected by 
women than the inadequate female will be 
rejected by men. Women, moreover, tend to 
seek romantic and marital heterosexual at- 
tachments in spite of their lack of physical 
satisfaction, or even in spite of their fear of 
or hostility toward men. Where many inade- 
quate males, therefore, will resort to homo- 
sexual activity, many equally inadequate fe- 
males will take refuge in frigidity. 

If this is true, it may be predicted that 
women who do become Lesbians will tend to 
be more emotionally disturbed, on the whole, 
than male homosexuals. This is exactly what 
has been found in the present study. At the 
same time, it should be noted that 29 per 
cent of the males and 67 per cent of the fe- 
males studied were found to be very severely 
emotionally disturbed, a quite high percentage
in both instances. This is probably because
confirmed homosexuals are, on the
whole, exceptionally disturbed individuals— 
which one would expect to be the case with 
persons who overtly surrender such an im- 
portant characteristic as their fundamental 
sex role and who, in addition, are usually 
severely frowned upon and persecuted by the 
rest of their society. 

The obtained significant relationship be- 
tween the patients’ improvement in hetero- 
sexual relationships and their desire for such 
improvement seems understandable enough. 
This is often true for all kinds of disturbed 
individuals: those who really want to get 
better, and who will do the hard work re- 
quired in the course of active psychoanalyti- 
cally-oriented therapy, almost invariably im- 
prove considerably, and often in a relatively 
short length of time. Homosexuals, in particu- 
lar, who are essentially phobic in that they 
fear sex-love involvements with the other sex 
and take what seems to be the “easier” way 
out of homoeroticism, can, like most other 
phobics, overcome their neurosis if they will 
(a) acquire insight into the origin of their 
fears and (b) begin to participate in the activity
they fear. If, with the help of good
motivation, they will permit a competent 
therapist to help, persuade, cajole, and goad 
them into acquiring this insight and doing 
what they illogically fear, they will usually 
succeed in overcoming their homosexual neu- 
rosis. 

Assuming that the psychotherapy received 
by the individuals studied actually helped 
them in overcoming their severe homosexual 
problems, the writer’s impression is that the 
following therapeutic techniques were impor- 
tant in this respect: (a) The therapist was 
quite accepting and noncritical in relation to 
the patients’ homosexual desires and acts in 
themselves, but at the same time insistent on 
unmasking the neurotic motivations behind 
exclusive, fetishistic, and obsessive-compulsive 
homosexuality. (b) The therapist did not insist
that the patients overcome all their homosexual
tendencies, but accepted many of these
tendencies as normal or idiosyncratic. He em- 
phasized the patients’ becoming more hetero- 
sexual rather than less homosexual. (c) The 
therapist showed, by his manner and verbalizations,
that he himself was favorably prejudiced
toward heterosexual relationships. (d)
Special attention was usually concentrated on 
the patients’ general antisexual attitudes and 
an active attack was made on their feelings of
sexual guilt and shame. (e) In every instance 
there was as much focusing on the individu- 
al’s general feelings of inadequacy as on 
his sex problems. A major goal of therapy 
was always the achievement of general ego- 
strengthening, on the assumption that exclu- 
sive homosexuality often follows from, and is 
in turn the further cause of, severe feelings 
of worthlessness. (f) Wherever possible, the 
patients were persuaded to engage in sex-love 
relationships with members of the other sex 
and to keep reporting back to the therapist 
for specific discussion of and possible aid 
with these love relationships. 


Summary and Conclusions 


Twenty-eight male and twelve female in- 
dividuals with severe homosexual problems 
were seen for from 5 to 220 sessions of ac- 
tive psychoanalytic psychotherapy. In terms 
of their achieving satisfactory sex-love rela- 
tions with members of the other sex, it was 
found that 36 per cent of the male patients 
were little or not at all improved; 25 per cent 
distinctly improved; and 39 per cent considerably
improved. Of the female patients, 33⅓
per cent were distinctly improved; 66⅔ per
cent considerably improved.

The female patients were found to be sig- 
nificantly more improved while being treated, 
more often married or divorced, and more 
often evaluated as very severely disturbed in- 
dividuals than were the male patients. 

A highly significant relationship was found 
between both male and female patients’ de- 
sires to achieve heterosexual adjustment and 
their benefiting during the period of psycho- 
therapy. 

Although due caution must be maintained 
in generalizing from this study, it is felt that 
there are some grounds for believing that the 
majority of homosexuals who are seriously 
concerned about their condition and willing 
to work to improve it may, in the course of 
active psychoanalytically-oriented psychother- 
apy, be distinctly helped to achieve a more 
satisfactory heterosexual orientation. 


Received August 23, 1955. 
Homosexuality and the Rorschach¹


Carl J. Nitsche, J. Franklin Robinson, 


Children’s Service Center, Wilkes-Barre, Pennsylvania 


and Edward T. Parsons 


The Pennsylvania State University 


This study was an attempt to determine 
whether certain Rorschach content categories 
occur more often in protocols of homosexuals 
than in those of nonhomosexuals. 

The Ss, 38 white males referred to the 
Adult Mental Health Clinic, were divided into 
two groups: 19 recently convicted homosexu- 
als and 19 patients with no known homosexual 
activity. 

Twelve content categories were selected 
from studies by Due and Wright (1), Ulett 
(4), Chapman and Reese (3), and Fein (2): 
mythical distortion, qualifications toward the 
abnormal, dehumanization, double identifica- 
tion, uncertainty, evasiveness, projection of 
feminized behavior, preoccupation with femi- 
nine apparel, castration and phallic symbol- 
ism, sexual and anatomical responses, esoteric 
language and artistic references, dislike of fe- 
male genital symbols. 

The number of responses in each of these 
categories was determined for each S of the 
two groups and expressed as a percentage of 
the total number of responses of each S. This
was done for each of the 12 categories separately
and for the 12 categories combined.
The means, standard deviations, and standard
error of the difference between means
were computed.


1 An extended report may be obtained without
charge from C. J. Nitsche, Children’s Service Center,
335 S. Franklin St., Wilkes-Barre, Pa., or for a
fee from the American Documentation Institute. To
obtain it from the latter, order Document No. 4811
from ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington
25, D. C., remitting in advance $1.25 for microfilm
or $1.25 for photocopies. Make checks payable
to Chief, Photoduplication Service, Library of Congress.

The homosexual group had more responses 
in 9 of the 12 categories than did the non- 
homosexual group, but the differences were 
not statistically significant at the .05 level. 
The mean for the homosexual group was 4.2 responses
in the selected categories, and for the
nonhomosexual group, 2.7. Of the selected 
categories, castration and phallic symbolism, 
and sexual and anatomical responses, occurred 
most often in both groups. The homosexual 
group had no double identification responses, 
and the nonhomosexual group, no double 
identification, evasiveness, or dislike of fe- 
male genital symbols. 


Brief Report. 
Received February 14, 1956. 
Relationship of Sexual Adjustment to Certain Sexual
Characteristics of Human Figure Drawings 


Carl N. Sipprelle and Clifford H. Swensen 


University of Tennessee 


Sexual aspects of human figure drawings 
are routinely used as indices of the sexual 
identification and adjustment of subjects. For 
example, Machover (4) states that “some de- 
gree of sexual inversion was contained in rec- 
ords of all individuals who drew the opposite 
sex first in response to the instruction, ‘draw 
a person.’”

It was the object of this investigation to at- 
tempt to determine what relationship existed 
between the sexual adjustment of psycho- 
therapy patients and some of the sexual as- 
pects of their human figure drawings. 


Procedure 


The subjects (Ss) consisted of 25 men and 
24 women in psychotherapy at the University 
of Tennessee Psychological Service Center. 
Two of these Ss were seen only for diagnostic 
testing and interview, 1 was seen for 4 ther- 
apy interviews, 11 were seen for from 8 to 
20 therapy interviews, and the remaining 35 
Ss were seen for more than 20 interviews. 
Twenty-four of the Ss were seen in over 50 
interviews. Three of the Ss were in therapy 
with graduate students in the last year of 
clinical training, and the remainder were seen 
by Ph.D. clinical psychologists with three or 
more years of experience in psychotherapy. 

All Ss were routinely given the Draw-A- 
Person Test when they applied to the Service 
Center for treatment. The sex of the first- 
drawn figure was noted and the drawings 
were rated on sexual differentiation using a 
scale described in a previous paper (6). The 
male figures for the men and women were 
ranked for masculinity and the female figures 
were ranked for femininity, by two independent
judges, and the rankings for the two
judges were averaged. The reliability of rankings
was determined by correlating the rankings
of Judge 1 with the rankings of Judge 2.
The correlations for the four rankings ranged 
from .50 to .84, with the average falling at 
.63. Although the lowest reliability coefficient 
(.50) is not particularly impressive, it was 
significant at the .01 level. 

The therapists rated their patients on a 
“sexual adjustment rating scale.” This scale 
consisted of eight subscales, with five points 
on each subscale. The subscales were: (a) reported
frequency of intercourse; (b) social
behavior with opposite sex at dances, dates,
parties, etc.; (c) attitude toward sexual
partner; (d) guilt over techniques for sexual
expression; (e) homosexual vs. heterosexual
orientation; (f) attitude toward opposite sex;
(g) symptoms relating directly to sex; (h)
discussion of sex in interview. Each point on
the subscale was given a numerical value, 
with 1 being “good” adjustment and 5 being 
“poor” adjustment. The ratings for the sub- 
scales were summed to obtain a total score 
on sexual adjustment. The arrangement of 
the subscales was randomized in order to 
minimize the possibility of halo effect. The 
raters were asked to rerate each of the Ss two 
months after the original ratings were made 
in order to check the reliability of the rat- 
ings. The reliability coefficients were all sig- 
nificant at well over the .01 level, and aver- 
aged .78. 

The relationships obtaining between the 
sexual adjustment scales and the drawings 
were determined by the chi-square technique 
for each sex group. That is, the men Ss (N
= 25) made up one group and the women Ss 
(N = 24) made up a separate group. 
Table 1
Relationships Among Sexual Adjustment Scales and Human Figure Drawings

Sex adjustment   Sex of first-     Sexual             Masculinity of    Femininity of
scales           drawn figure      differentiation    male figure       female figure
                 M       W         M       W          M       W         M       W

A                3.23    2.94      .71     2.77       .65     .68       .65     .68
B                .81     1.08      .70     .75        2.61    3.00      .00     .00
C                .81     4.00*     .70     .76        .65     .00       .65     .00
D                .80     .00       2.80    .00        .00     2.71      .65     2.71
E                .90     .89       .83     2.71       .00     .68       .00     2.71
F                3.23    .89       .00     .68        .69     .68       .71     .68
G                .84     .00       3.07    2.95       .00     .72       2.80    2.87
H                3.23    .88       .00     2.71       .65     .68       .65     .68
Total            .81     .00       .00     .75        .00     3.00      .00     3.90*

* Significant at .05 level.


Results and Discussion 


Of 72 computed statistics, only two were 
significant at the .05 level. These results are 
contained in Table 1. They show that women 
who tend to draw the female figure first also 
tend to be dissatisfied with their sexual part- 
ners and that the women who draw the more 
feminine female figure tend to be better ad- 
justed sexually, as measured by the total 
score on the sexual adjustment scale. How- 
ever, such a small number of significant sta- 
tistics out of such a large number of com- 
puted statistics are undoubtedly due to 
chance. It would appear to the authors that 
these results are in agreement with those of 
Fisher and Fisher (2). Although they con- 
cluded that women who draw female figures 
of average femininity tend to adjust better 
sexually than women who draw figures that 
are extremely feminine or masculine, it will 
be noted that out of the Fishers’ 48 com- 
puted statistics only eight were significant at 
the .05 level. 
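
The chance argument above can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that the 72 tests are independent and each has a .05 chance of reaching significance when no relationship exists. The report makes no such independence claim, and statistics computed on the same subjects are unlikely to be fully independent.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    return sum(comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k, n + 1))


n_tests = 72   # number of computed statistics reported in the text
alpha = 0.05   # significance level used in the text

# Expected count of "significant" results if every null hypothesis were true.
expected_by_chance = n_tests * alpha

# Probability of seeing two or more significant results under the same assumption.
p_two_or_more = prob_at_least(2, n_tests, alpha)

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least 2 significant): {p_two_or_more:.2f}")
```

Under these assumptions, about 3.6 significant statistics would be expected by chance alone, so observing only two gives no evidence of a real relationship.
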

It will be noted that there was no signifi- 
cant relationship between the sex of the first 
drawn person and sexual adjustment. These 
findings are in agreement with those of other 
investigators (1, 3, 5). 

These rather disappointing findings with 
the DAP suggest that the DAP is either an 
insensitive instrument or an unreliable one
which is useful only as a gross indicator of
psychodynamics. 


Summary 


Forty-nine psychotherapy patients were 
rated by their therapists on sexual adjust- 
ment. Several sexual aspects of the patient’s 
human figure drawings were rated. It was 
found that there was no significant relation- 
ship between the patient’s sexual adjustment 
and the sexual characteristics of their human 
figure drawings. 


Received August 29, 1955. 
Psychological Concomitants to Rate of Recovery
from Tuberculosis¹


Louis J. Moran,² George W. Fairweather, Seymour Fisher,
and Robert B. Morton 
VA Hospital, Houston, Texas 


The presence of psychological determinants 
in the course of recovery from tuberculosis ap- 
pears to have gained wide acceptance among 
specialists in the treatment of tuberculosis. It 
is common to find in the extensive medical 
literature on the subject statements such as 
Wittkower’s that, “sometimes it may be safer 
to assess the patient’s prognosis on the basis 
of his personality and his emotional conflicts 
than on the basis of the shadow on his film” 
(4, p. 202). 

Only in recent years, however, have at- 
tempts been made to furnish experimental 
evidence for this conviction, which is based 
almost exclusively on clinical observation. In 
the main, these studies take two forms. One 
type of study draws upon a comprehensive 
and intensive study of selected patients with 
the purpose of abstracting personality dy- 
namics to account for differences in rate of 
recovery. Representative of such studies are 
those of Benjamin, Coleman, and Hornbein 
(1), and Wittkower, Durost, and Laing (4). 
The first investigators emphasize the status of 
hostility and the second investigators empha- 
size the nature of defenses against dependency 
as determinants in rate of recovery from tu- 
berculosis. 

A second type of study seeks to relate test 
scores to specific criterion groups of fast and 
slow recoverers. Typical of these are Cohen’s 
(3) study, which used a wide variety of Ror- 
schach scoring variables, and Brotman’s (2) 


1 The authors would like to acknowledge the gen- 
erous support and guidance of Drs. Daniel E. Jen- 
kins, Hollis G. Boren, and Irving Chofnas, tubercu- 
losis specialists, in the conduct of this study. 

2 Now at the University of Texas.
use of various combinations of MMPI scores.
The results of these studies have been essen- 
tially negative. 

The present study is designed to examine 
the relationship between the rate of recovery 
of tuberculosis patients and four psychologi-
cal variables: (a) behavior on the ward, (b)
attitudes toward the hospital environment, 
(c) past behavior before hospitalization, and 
(d) fantasy as measured by the Thematic 
Apperception Test (TAT). 


Method 


The length of time required for the patient 
to convert from positive to negative bacteri- 
ology was selected as the criterion for rate of 
recovery. Routine laboratory reports are made 
on the bacteriological status of the patient’s 
sputum or his gastric contents. A negative re- 
port indicates the absence of tubercle bacilli 
in the laboratory specimen. For purposes of 
this study, after five successive negative labo- 
ratory reports the patient is considered to be 
“converted,” i.e., from positive to negative, 
bacteriologically. 

The 46 patients studied were also subjects 
in a national research on the effectiveness of 
chemotherapy. For purposes of this latter 
project, all subjects were excluded who pre- 
sented complications for an evaluation of 
chemotherapy, such as previous drug treat- 
ment, certain other diseases, etc. Also, sub- 
jects whose physical condition prohibited the 
random assignment of certain drugs were ex- 
cluded. At the follow-up four months after 
admission virtually no patients had con- 
verted; at the five- to eight-month follow-up, 
over two-thirds had converted; and at the
nine- to twelve-month follow-up, virtually all 
patients had converted. The highly selected 
nature of the present sample, as well as the 
potent effect of chemotherapy on the criterion,
should be carefully considered in generalizing 
the results of this study to the general popu- 
lation of tuberculous patients. 

The 32 patients that converted within five 
to eight months were designated the fast re- 
covery group. The 14 patients that did not 
convert within five to eight months were 
designated the slow recovery group. There 
was no statistically significant difference be-
tween the fast recovery group and the slow
recovery group in age, race, previous bed- 
rest treatment, or stage of illness (i.e., mini- 
mal, moderate, far advanced). All subjects 
received chemotherapy during the eight- 
month period under study. 

Measurement of behavior in the hospital.³
The ward behavior of each patient was rated 
on a 64-item rating scale independently by 
two aides and two nurses who had usually 
attended the patient for several months prior 
to the rating. Items representing three classes 
of behavior were selected on a rational basis 
from the pool of items: (a) 14 items con- 
cerning conformity to regulations, e.g., stay- 
ing in bed, wearing a mask, covering coughs, 
(b) 22 items concerning relations with per-
sonnel, e.g., forms of demanding, complain-
ing, conviviality, and (c) 7 items concerning
relations with other patients, e.g., hazing, “re- 
porting” a patient, disturbing others. 

Each item offered two alternatives, one of 
which was judged on an a priori basis to be 
adaptive. The score for a specific item was 
the number of times the adaptive alternative 
was checked. For example, if all four raters 
checked the adaptive alternative the score 
was four; if two raters checked the adaptive 
alternative the score was two. The patient’s 


3 A complete set of the measuring instruments (ex- 
cept for TAT), with scoring instructions, has been 
deposited with the American Documentation Institute. 
Order Document No. 4812 from the ADI Auxiliary 
Publications Project, Photoduplication Service, Li- 
brary of Congress, Washington 25, D. C., remitting 
in advance $1.25 for photoprints or $1.25 for micro- 
film. Make checks payable to Chief, Photoduplica- 
tion Service, Library of Congress. 







score on any subscale was the sum of the 
scores for the items comprising that subscale. 
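
The ward-behavior scoring rule described above can be sketched as follows. The ratings are hypothetical and the function names are illustrative only. Each item is represented by four booleans, one per rater, marking whether that rater checked the adaptive alternative.

```python
def item_score(adaptive_checks):
    """Number of raters who checked the adaptive alternative for one item."""
    return sum(1 for checked in adaptive_checks if checked)


def subscale_score(items):
    """Sum of item scores over the items making up one subscale."""
    return sum(item_score(checks) for checks in items)


# Hypothetical ratings for three items, each rated by two aides and two nurses.
conformity_items = [
    [True, True, True, True],      # all four raters chose the adaptive alternative
    [True, False, True, False],    # two of four
    [False, False, False, False],  # none
]

scores = [item_score(checks) for checks in conformity_items]
total = subscale_score(conformity_items)
```
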

Measurement of attitudes toward the hos- 
pital environment. The attitude scale con- 
sisted of items representing three classes of 
attitudes, selected on a rational basis from a 
pool of 45 items: (a) 10 items concerning 
ward regulations, (b) 14 items concerning
personnel, and (c) 4 items concerning other 
patients. Items were couched in terms of 
statements, e.g., “If I had a little more up- 
time I would get well just as quickly.” The 
statement was read to the patient, who was 
instructed to respond in one of the following 
four ways: strongly agree, moderately agree, 
strongly disagree, moderately disagree. 

In the construction of the scale, an a priori 
judgment was made as to whether agreement 
or disagreement with a particular item could 
be considered least or most adaptive. The re- 
sponse of the patient to each item was scored 
0, 1, 2, or 3, depending on the intensity of 
the attitude and its adaptive directionality. 
Thus, an item considered adaptive in the di- 
rection of agreement received the following 
scores: strongly disagree = 0, moderately dis- 
agree = 1, moderately agree = 2, strongly 
agree = 3. The score for any subscale was 
the sum of the scores for the items compris- 
ing that subscale. 
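
A minimal sketch of the attitude scoring follows. The text spells out the scores only for items judged adaptive in the direction of agreement. Mirroring the scale for items judged adaptive in the direction of disagreement is an assumption made here, and the names are illustrative.

```python
# Response options ordered from strongest disagreement to strongest agreement.
RESPONSES = [
    "strongly disagree",
    "moderately disagree",
    "moderately agree",
    "strongly agree",
]


def score_item(response, adaptive_is_agree):
    """Score one response 0-3, with 3 as the most adaptive response.

    The reversal for items where disagreement is adaptive is assumed,
    not taken from the report.
    """
    position = RESPONSES.index(response)  # 0 through 3
    return position if adaptive_is_agree else 3 - position


def subscale_score(responses, adaptive_directions):
    """Sum of item scores over one subscale."""
    return sum(score_item(r, d) for r, d in zip(responses, adaptive_directions))


# Hypothetical answers to three items; the second item is adaptive when rejected.
answers = ["strongly agree", "moderately agree", "moderately disagree"]
directions = [True, False, True]
total = subscale_score(answers, directions)
```
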

Measurement of prehospital life adjustment. 
This information was obtained by means of 
structured interviews. The total interview 
schedule concerning past life adjustment con- 
sisted of 132 items. From this pool of items, 
three rational subscales were constructed: 
(a) 7 items concerning response to institu- 
tional control, e.g., school attendance, arrests, 
military disciplinary actions, (b) 9 items
concerning general interpersonal relations, 
e.g., fights, number of close friends, preferred 
activities, and (c) 12 items concerning past 
life achievement, e.g., school achievement, or- 
ganizational leadership, gain in military rank, 
gain in occupational level. 

Each possible response to an item was 
ranked on an a priori basis from least to 
most adaptive. The score for each possible 
response to an item was its rank position, 
with the least adaptive response receiving a 
score of zero. Thus, the response to an item 
with two response possibilities received a 
score of zero or one; an item with three re-
sponse possibilities received scores of zero,
one, or two, and so on through the items. 

To prevent items with the greatest number 
of response possibilities from unduly influ- 
encing the total scale score, responses to each 
item were rescored on a scale from zero to 
nine. Thus, an item with two responses di- 
vided the zero-to-nine scale into three equal 
units and the two alternatives received scores 
of 3 and 6. A three-response item divided the 
zero-to-nine scale into four equal units and 
the three alternatives received scores of 2.3, 
4.5, and 6.7, respectively. All items on the 
past life adjustment scale were so rescored. 
Any subscale score was the sum of the items 
comprising that scale. 
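
The rescoring rule can be written out as a short sketch: an item with k ranked responses splits the zero-to-nine scale into k + 1 equal units, and the responses fall on the interior division points. The function name is illustrative. The values printed in the text for a three-response item (2.3, 4.5, 6.7) appear to be these division points rounded to one decimal.

```python
def rescaled_scores(n_responses):
    """Scores on a 0-9 scale for an item with n_responses ranked alternatives.

    The scale is cut into n_responses + 1 equal units; the least adaptive
    response gets the first interior cut point, the most adaptive the last.
    """
    if n_responses < 1:
        raise ValueError("an item needs at least one response alternative")
    step = 9 / (n_responses + 1)
    return [step * rank for rank in range(1, n_responses + 1)]


two_response_item = rescaled_scores(2)
three_response_item = rescaled_scores(3)
```
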

Measurement of fantasy productions. Stories 
to ten TAT pictures were obtained from 38 of 
the 46 patients. The test was administered in- 
dividually, with cards in the following se- 
quence: 1, 2, 4, 12M, 8BM, 6BM, 17BM, 
18BM, 6GF, 7BM. These protocols were 
scored by Seymour Fisher in a blind manner, 
i.e., without knowledge of the individual pa-
tients, of the criterion group into which any
patient fell, or of the total number of cases 
in either of the criterion groups. Each proto- 
col was scored in the following four ways. 


1. Extent of hostile fantasy was judged by the 
number of instances in which a person was described 
as intentionally injuring, harming, deceiving, block- 
ing or inflicting any kind of damage to another per- 
son. Hostility was scored even when the act was de- 
scribed as justified or as being carried out by an 
official agent, e.g., policeman. Hostility was not
scored when the damaging act was “accidental” or 
due to abstract forces of nature. Interjudge rank- 
order reliability, based on ten independently scored 
protocols was .92. 

2. Vague or weak vs. clear or strong depiction of 
parental figures was judged on the basis of stories 
to cards 2, 6BM, and 7BM. Parental figures were 
scored as clear or strong if the story described either 
parent in a clearly domineering or unfriendly role or 
in a clearly favorable or friendly role. Parental fig- 
ures were scored as vague or weak if the parents 
were described as inadequate or if the story data 
were too fragmentary or unclear to permit classifi- 
cation in the clear or strong category. An interjudge 
reliability of .81 on the dichotomous scores was ob- 
tained on twenty independently scored protocols. 

3. Aspiration fantasies were judged by the num- 
ber of instances in which a person was described as 
having high aspirations, engaging in unusually hard 
work, attaining financial success, doing an excellent 
job, being obligated to great effort, or attaining any
laudable aim. The judgment of aspiration fantasy 
included statements of determination, special pur- 
pose, and the adoption of a purposeful, challenging 
attitude toward the world. An interscorer rank- 
order reliability of .62 was obtained on ten inde- 
pendently scored protocols. 

4. Fantasies of inactivity were judged by the in- 
cidence of all explicit references to sleeping, day- 
dreaming or dreaming, passively giving up, being in 
a dazed state of consciousness, being pitifully inade- 
quate to a task, or being dreamily preoccupied with 
something distant from the present situation. Lack 
of inactive fantasy received the highest score. Rank- 
order reliability of scoring, based on ten independ- 
ently scored protocols was .79. 


Statistical treatment. It was hypothesized 
that rapid improvement should be correlated 
positively with high scores on each of the 
psychological variables. This hypothesis was 
tested by biserial correlation except for the 
fantasy variables which, because of dichoto- 
mous scoring, were tested by tetrachoric cor- 
relation. 
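
For readers unfamiliar with the biserial coefficient, the sketch below shows the standard textbook formula, which assumes a normally distributed variable underlying the fast/slow dichotomy. The report does not give its computational formula, so this is an assumption, and the scores are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, in_fast_group):
    """Biserial correlation between a continuous score and a fast/slow split.

    r_b = (M_fast - M_slow) / s * (p * q / y), where s is the standard
    deviation of all scores, p and q are the group proportions, and y is
    the standard normal ordinate at the point dividing p from q.
    """
    fast = [x for x, f in zip(scores, in_fast_group) if f]
    slow = [x for x, f in zip(scores, in_fast_group) if not f]
    if not fast or not slow:
        raise ValueError("both groups must contain at least one case")
    p = len(fast) / len(scores)
    q = 1 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(fast) - mean(slow)) / pstdev(scores) * (p * q / y)


# Hypothetical subscale scores for nine patients (not data from the study).
scores = [12, 9, 14, 8, 11, 10, 7, 13, 6]
fast_recovery = [True, True, True, True, True, False, False, False, False]

r_b = biserial(scores, fast_recovery)
```

Unlike the point-biserial coefficient, this estimate can exceed 1.0 when the scores depart markedly from normality, which is one reason to treat it cautiously in small samples.
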

The variables were then intercorrelated and 
a smaller matrix derived by means of a clus- 
ter analysis of the total matrix. This derived 
matrix provides the cluster of variables that 
have a positive correlation with the criterion 
variable and which also are all positively cor- 
related with each other. 

It is necessary at this point to anticipate 
the results in order to account for the pres- 
ence of an additional variable. The discovery 
of a significant relationship between the TAT 
story depiction of vague or weak parental fig- 
ures and the patient’s fast recovery led to 
speculation about the absence of parental fig- 
ures in the early life of such patients. This 
additional post hoc variable, complete ab- 
sence (death, divorce, or separation) of one 
or both parents before the patient was ten 
years old, was therefore included.


Results 


In Table 1 is shown the correlation of each 
variable with the criterion of fast recovery. 
The probability that this distribution of cor- 
relations could have occurred by chance is 
p < .01. As indicated in Table 1, conformity 
to hospital regulations and good relations 
with personnel were significantly correlated 
with fast recovery, .38 and .55, respectively. 
Good relations with other patients also
showed a correlation of .29 with rapid im-
provement, which statistically was not sig-
nificantly greater than zero.

Table 1

Correlation Between Fast Recovery and Selected
Psychological Variables

Psychological variables              Correlation

Behavior toward regulations             .38**
Behavior toward personnel               .55
Behavior toward patients                .29
Attitude toward regulations            −.11
Attitude toward personnel               .10
Attitude toward patients               −.02
Past behavior toward control            .01
Past behavior toward peers              .14
Past behavioral achievement            −.18
Fantasy hostility                       .44*
Fantasy clear parent                   −.61
Fantasy aspirations                     .25
Fantasy (lack of) inactivity            .35
Loss of real parent                     .58

* p < .05 that r ≤ .00.
** p < .01 that r ≤ .00.
*** p < .001 that r ≤ .00.

Neither the attitude measures nor the 
measures of past behavior (prehospital) was 
significantly correlated with fast recovery. 

The measure of fantasy hostility correlated 
.44, p < .05, with fast recovery. High fan-
tasy aspiration and the relative absence of
fantasies of inactivity were correlated .25 and
.35 with fast recovery, but neither correlation
was significantly greater than zero. The correla-
tion of −.61 between fantasy-clear-parent
and fast recovery was in the direction oppo- 
site that predicted. This finding led to the 
inclusion of a post hoc “hunch” variable, 
complete absence of one or both parents be- 
fore the patient was ten years old. This loss-
of-parent variable was significantly corre-
lated .58 with fast recovery.

Table 2 presents an empirically derived 
cluster of those variables that were positively 
correlated with fast recovery and were also 
correlated positively with each other. For con- 
venience of the reader, fantasy-clear-parent 
in Table 1 has been changed to fantasy- 
vague-parent in Table 2, and the signs re- 
versed. 


Discussion 


When the results of this study are viewed 
collectively, it would appear that variability 
in rate of recovery from tuberculosis is clearly 
associated with observed adaptive behavior 
on the ward, associated to some degree with 
measures of fantasy, and unrelated to ex- 
pressed attitudes or to prehospital behavior, 
as measured. 

The cluster of positive intercorrelations in 
Table 2 furnishes what might be considered 
an empirically derived “syndrome,” charac- 
terizing those patients in the present sample 
who recovered most rapidly. A central char- 
acteristic of this syndrome might be labeled 
“docile amiability.” The extraordinary re- 
strictions imposed as conditions of treatment 
are obligingly respected; interpersonal rela- 
tions with the personnel who administer these 
restrictions are generally very cordial; and 
there is some indication that relations with 
other patients are better than average. Hos- 
tility, when it appears, is more likely to be 
evident in fantasy material than in the overt 
behavior of the patient. 

Table 2

Derived Matrix of Positively Intercorrelated Variables

Variables                           1       2       3       4       5       6       7

1. Fast recovery
2. Behavior toward regulations     .38*
3. Behavior toward personnel       .55*    .49**
4. Behavior toward peers           .29     .66**   .63**
5. Fantasy hostility               .44     .17     .04     .05
6. Fantasy vague parent            .61*    .13     .22     .01     .19
7. Loss of real parent             .58*    .04     .22     .48*    [illegible]   .35

* p < .05 that r = .00.
** p < .01 that r = .00.

The last two variables in the cluster en-
courage speculation about the etiological sig-
nificance of parental identification in the de-
velopment of “docile amiability.” Both the
lack of clear representation of parental fig-
ures in the TAT stories and the absence of
one or both parents during the patient’s child-
hood are suggestive of limited identification
with parental figures.

The reader is free to place any interpreta- 
tion he chooses on these strictly empirical 
data. We entertain the hypothesis that it is 
the strongly identified American male that is 
likely to show slower recovery from tubercu- 
losis. He is the one most likely to obstruct 
treatment by rebelling actively against the 
rigidly enforced dependency which is a con- 
dition of treatment for tuberculosis. This hy- 
pothesis receives some support from Witt- 
kower’s finding that the tuberculosis cases 
with poorest prognosis were those whose 
“lives were governed by conscious and un- 
conscious attempts to emulate, please, or 
placate their ambitious, driving and, hence, 
ambivalently regarded parents whose image 
had become an integral part of their person-
ality structure. . . . Such persons take badly
to hospitalization. . . . Their pent-up ten- 
sion may be discharged in general restless- 
ness and/or undesirable activities at the hos- 
pital” (4, p. 211). 


Summary 


Rate of recovery from tuberculosis was 
estimated for 46 patients by the amount of 
time required for bacteriological conversion 
to occur. This criterion was then correlated
with four classes of psychological measures: 
(a) behavior on the ward, (b) attitudes to-
ward the hospital environment, (c) life his-
tory, and (d) responses to ten TAT cards.

Variability in rate of recovery from tu- 
berculosis was found to be associated with 
adaptive behavior on the ward, associated to 
some degree with measures of fantasy, and 
unrelated to expressed attitudes or to pre- 
hospital behavior, as measured. A cluster of 
positive correlations was then derived from 
the total matrix, and hypotheses offered con- 
cerning personality “syndromes” associated 
with fast recovery and slow recovery from 
tuberculosis. 


Received September 12, 1955. 
Flexor-Extensor Movement on the Rorschach¹


Mitchell Wetherhorn 


Cambridge (Minnesota) State Hospital


This study attempted to validate current 
interpretations of flexor and extensor move- 
ment responses on the Rorschach test. It has 
been suggested that extensor movement is a 
correlate of aggressive, masculine, dominance 
striving, and flexor movement is a correlate 
of passive, submissive, feminine striving. In 
order to test this hypothesis, a series of spe- 
cial movement plates was designed, consist- 
ing of two human figures in opposition. Spe- 
cial plates were needed to elicit large num- 
bers of movement responses since the average 
number of responses classifiable as flexor or 
extensor movement on the regulation Ror- 
schach protocol is small. 

Stated as a null hypothesis, flexor-extensor
movement is not a measure of ascendancy-
submission or of masculinity-femininity.

The Ss were undergraduate college students 
enrolled in general psychology. The group 
comprised 47 females and 33 males. Each S 
was administered the Mf scale of the MMPI, 


1 An extended report of this study may be ob-
tained without charge from Mitchell Wetherhorn,
Cambridge State Hospital, Cambridge, Minnesota, 
or for a fee from the American Documentation In- 
stitute. Order Document No. 4828 from ADI Aux- 
iliary Publications Project, Photoduplication Service, 
Library of Congress, Washington 25, D. C., remit- 
ting in advance $1.25 for microfilm or $1.25 for 
photocopies. Make checks payable to Chief, Photo- 
duplication Service, Library of Congress. 


the A-S Reaction Study, and the special move-
ment plates (SMP).

Female Ss produced significantly more flexor
movement responses on the SMP than did
the male Ss (t = 2.45; p < .05). Males
did not produce significantly more extensor
responses than did the females (t = .36;
n.s.). The female scores correlated as fol-
lows: A-S and flexor movement −.06, A-S
and extensor movement +.10, Mf and flexor
movement −.07, and Mf and extensor move-
ment +.02. Male scores correlated as fol-
lows: A-S and flexor movement −.07, A-S
and extensor movement −.19, Mf and flexor
movement −.25, Mf and extensor movement
+.20. None of the correlations were signifi-
cant.

In conclusion, it would appear that a flexor-
extensor movement continuum does not meas-
ure the same dimensions of personality as the
A-S Study or the Mf scale of the MMPI. 
There is need for further validation study 
of flexor-extensor movement interpretations. 
Perhaps a projective technique such as the 
SMP taps a deeper level of personality than 
questionnaires, hence the lack of higher cor- 
relations. Nevertheless, caution should be ob- 
served in utilizing flexor-extensor movement 
as a measure of the variables studied. 


Brief Report. 
Received February 17, 1956. 
The Goal-Spurt Hypothesis and the Rorschach Test¹


Louis J. Maradie 


Dade County Board of Public Instruction, Miami, Florida 


The possible relationship between color and 
stimulus objects and its effects upon percep- 
tion have been the subject of many investi- 
gations. In discussing the influence of color 
upon the subject’s perceptions to the stimulus 
cards, Rorschach stated that, “There is a defi-
nite correlation between the extent of motor
activity and the number of responses influ- 
enced by color perception.” He apologetically 
added that, “The causes of this correlation 
remain to be discovered” (6). 

Confounding the problem is the fact that 
although there is little experimental evidence 
showing that the presence or absence of color 
in the stimulus figures is related to the num- 
ber or significance of responses, clinically we 
find that the color plates often do evoke sig- 
nificant responses indicative of the patient’s 
affective status. 

Specifically, this study is concerned with 
the classical assumption that the presence of 
color in the standard VIII, IX, and X Ror- 
schach plates stimulates normal subjects and 
is responsible for the subsequent increase in 
the number of responses to these cards. 
Klopfer (3) uses the ratio of responses to 
Cards VIII, IX, and X divided by the total 
number of responses to Cards I through X 
inclusive. According to him, underproduction 
is indicated when the obtained ratio is less 
than .30. Overproduction is indicated when 
the obtained ratio goes above .40 to .50. He 


1 Part of a dissertation presented to the Faculty of 
the Graduate School of the University of Kentucky 
in candidacy for the Degree of Doctor of Philoso- 
phy. The writer wishes to express his gratitude and 
appreciation to Dr. Richard L. Blanton, the director 
of this dissertation, whose suggestions and guidance 
were of inestimable value. Grateful acknowledgment 
is also made to Dr. Graham B. Dimmick and to Dr. 
Hans Hahn who made available some of the test 
materials. 


interprets these variables as functions of
either the “stimulating” or “disturbing” ef-
fects of color.

Lazarus was among the first to subject the
problem of the stimulus value of color to ex-
perimentation (4). In comparing his subjects’
performance on the chromatic and achromatic
inkblots, he found that the absence of color
had little effect on the scoring categories when
the protocols were evaluated. Using the same
technique, Sappenfeld and Buker (7) found
no significant differences in the number of
responses evoked by chromatic and achromatic
versions of Cards VIII, IX, and X.

This author (5) circumvented the artifacts
which arise by retesting the same subjects
under different conditions. Ten distinct sys-
tematically randomized sequential orders of
standard Rorschach plates in a Latin-square
design were individually administered to his
subjects. He found that irrespective of the
particular sequential order of the cards, the
position of the cards in the series was of more
importance, with later-appearing cards evok-
ing more responses than earlier-appearing
cards. It was not definitely shown that color
did not also have some effect, since this vari-
able was not studied by the use of achromatic
test plates.

Although the above-mentioned studies (4,
5, 7) have conclusively demonstrated that an
increment in productivity is not inextricably
related to the presence or absence of color,
they have neither stimulated a refashioning
of the classical assumptions nor have they of-
fered acceptable explanations for the in-
creased productivity phenomenon.
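
The Klopfer ratio described earlier in this section can be sketched as a small calculation. The response counts are hypothetical, the function names and the "within expected range" label are illustrative, and the upper cutoff is an assumption: the text gives it only as "above .40 to .50", so it is left as a parameter.

```python
def color_card_ratio(responses_per_card):
    """Responses to Cards VIII, IX, and X divided by responses to Cards I-X.

    responses_per_card: ten response counts, in card order I through X.
    """
    if len(responses_per_card) != 10:
        raise ValueError("expected response counts for ten cards")
    total = sum(responses_per_card)
    if total == 0:
        raise ValueError("protocol contains no responses")
    return sum(responses_per_card[7:]) / total


def classify(ratio, under=0.30, over=0.40):
    """Label a ratio using the cutoffs quoted in the text.

    The 'over' cutoff is assumed to be .40; the text says '.40 to .50'.
    """
    if ratio < under:
        return "underproduction"
    if ratio > over:
        return "overproduction"
    return "within expected range"


# Hypothetical protocol: 25 responses in all, 8 of them to the last three cards.
counts = [3, 2, 3, 2, 2, 3, 2, 3, 2, 3]
ratio = color_card_ratio(counts)
label = classify(ratio)
```
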


Procedure 


Eighty females, who were enrolled as stu-
dents in two nurses’ training schools in Louis-
ville, Kentucky, volunteered to act as sub-
jects (Ss). They ranged in age from 17 to 35
years with a median chronological age of 18. 
All Ss were Rorschach-naive and were assured 
that the results of their performances would 
be kept confidential. They were individually 
tested within a two-month interval. 

Both standard and achromatic Rorschach 
plates were used in this study. The achromatic 
figures, identical in structure and size to the 
standard plates, were printed in black ink 
only. Verlag Hans Huber of Bern, Switzer- 
land, produced both series of figures, but the 
achromatic figures were mounted by the au- 
thor. 

The work tasks were three in number. 


The Digit Symbol task is a visual-motor associa- 
tion task which involves the substitution of three 
different numbers—2, 4, and 6—under the appro- 
priate symbol—%, #, and &. Ten 5%- by 8-inch 
cards, with 54 substitutions to be made on each, 
were presented to each of the Ss. No symbol fol- 
lowed itself and each symbol was present 18 times 
on each card. 

The Successive Addition task, a modification of 
Hahn’s KRHH Test, is a more complex visual-motor 
task. Ten 5- by 7-inch cards, each composed of 5 
columns of 11 one-digit numbers, were presented to 
each S. By successively adding each overlapping pair 
of numbers down each of the five columns, 50 sums 
could be calculated on each card. 

The third task, Color Naming, is a modification 
of Stroop’s Color-Word Test (9). Color Naming is 
a novel visual-verbal task requiring a shift from the 
usual reading of words to the naming of the color 
of the ink in which the word is printed. This stimu- 
lus material was composed of ten 6- by 8-inch cards. 
Each card contained five color names (Green, Red, 
Black, Blue, and Orange) printed incongruously in 
the five colors, that is, “Green” printed in blue, red, 
black, and orange ink, “Red” printed in green, black, 
blue, and orange ink. No word was printed in the 
color of ink which it designated. Each card con- 
tained 60 such words. No color followed itself in the 
same row; equal numbers of each of the colors were 
printed on each card. 


Every card in each of the three work tasks 
was unique, thus minimizing the possibility of 
learning from trial to trial. Each card of each 
work task had an equal number of operations 
requiring approximately a total time of one 
minute per card to complete. 

With demonstration cards identical to the
actual cards of each of the three work tasks,
practice periods were given to each S in order
to minimize any learning effects. One minute
of practice on both the Color-Naming and
Successive Additions tasks was given.
Because of the association learning involved 
on the Digit-Symbol task, two minutes of 
practice were allowed. 

In order to determine the effect of orienta- 
tion toward the goal on productivity, as meas- 
ured by responses on the Rorschach Test and 
time scores on the work tasks, two major 
groups were designated. Half of the Ss com- 
posed the “Goal-Oriented” group and the re- 
mainder, the “Goal-Disguised” group. 


Those in the Goal-Oriented group received instructions
which indicated the actual number of operations
(Rorschach cards or work-task cards) which
they would be called upon to perform. The Ror- 
schach test instructions to these Ss were those sug- 
gested by Beck (1) in which the number of cards 
the S will receive is clearly specified. They were ad- 
monished not to make any mistakes on the work 
tasks by attempting to work rapidly. It was em- 
phasized that they should work at their own normal 
and comfortable rate of speed. 

To further insure a set for goal orientation, the 
stacks of ten Rorschach cards and the stacks of ten 
cards of each of the work tasks, appropriately num- 
bered on the reverse side of each, were visible to the 
S and kept before him during the entire testing 
period. 

The Goal-Disguised group was subjected to a 
more trying procedure in that its work task and 
Rorschach test instructions were completely unstruc- 
tured as to orientation toward the goal. The Ror- 
schach test instructions to these Ss were also those 
suggested by Beck except for the substitution of the 
word “series” in place of the word “ten.” These Ss did
not know the number of cards which would be pre- 
sented. The admonition concerning errors and rate 
of speed was identical to that given the Goal-Ori- 
ented group. 

Half of the Ss in this group were confronted with 
a stack of twenty Rorschach cards placed before 
them. Nothing was communicated to imply that they 
would not be expected to respond to them all. In 
the case of the remaining Ss, the ten Rorschach 
plates were hidden from view under a box at the 
examiner’s side and were presented one at a time. 

The two disguise techniques were also carried out 
with alternate work tasks. With approximately half 
the Ss, two of the three work tasks were hidden and 
the third disguised in the stack of twenty. The other 
Ss were presented two of the work tasks disguised 
in the stack of twenty and the third task hidden at 
the examiner’s side. 


In both the Goal-Oriented and Goal-Dis- 
guised groups, half of the Ss in each group 
received the standard plates and half the 
achromatic plates. Instead of presenting the 
ten cards in their usual order from I through
X, a 10 by 10 latin square was constructed 
(5). Ten distinct sequential orders were ar- 
ranged in which each card preceded and fol- 
lowed every other card just once, and each 
card appeared once in each column and once 
in each row. To each of these discrete orders, 
an S was randomly assigned and presented 
with the plates, either standard or achromatic, 
peculiar to that order. The same latin square 
was replicated seven times by the random as- 
signment of additional Ss to each of the ten 
orders of presentation. Eight Ss responded to 
each of the ten dissimilar orders of presenta- 
tion. 
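
A square with the properties just described (each card once in every row and column, and each card immediately preceded and followed by every other card exactly once) can be produced by a Williams-type construction for an even number of treatments. The sketch below is an illustration only: the study took its square from reference (5), and the function name, the zero-based card labels, and the reading of "preceded and followed" as immediate adjacency are assumptions made here.

```python
from collections import Counter


def balanced_latin_square(n):
    """Return an n x n Latin square (n even) balanced for immediate order.

    Row 0 is 0, 1, n-1, 2, n-2, ...; each later row adds 1 (mod n).
    """
    if n % 2 != 0:
        raise ValueError("this construction needs an even n")
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    return [[(card + shift) % n for card in first] for shift in range(n)]


# Ten orders of presentation for ten cards (labels 0..9 stand for Cards I..X).
square = balanced_latin_square(10)

# Count every ordered pair of immediately adjacent cards across all rows.
pairs = Counter(
    (row[i], row[i + 1]) for row in square for i in range(len(row) - 1)
)
```

Each row of `square` is one order of presentation; assigning eight Ss to each row would correspond to the eight replications described above.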

In both Goal-Oriented and Goal-Disguised 
groups, half the Ss were presented the Ror- 
schach test prior to the work tasks, while for 
the other half, the work tasks were followed 
by the Rorschach test. 

Both the Rorschach test scores (number of
responses per card) and the work-task scores
(number of seconds per card) were treated
mathematically by the analysis-of-variance
technique, followed by t tests when appropriate.

The Rorschach Goal-Spurt ratio is com- 
posed of the number of responses to the cards 
in positions eight, nine, and ten divided by 
the total number of responses to the cards in 
positions one through ten. 

The work-task Goal-Spurt ratio is com- 
posed of the number of seconds on cards 
eight, nine, and ten divided by the number 
of seconds on cards one through ten. A ratio 
was calculated for each of the three work 
tasks and averaged into a single ratio. 
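
The two ratios defined above reduce to the same arithmetic, sketched here for a single S. The function names and the sample data layout are hypothetical. If productivity were spread evenly over the ten positions, the ratio would be 3/10 = .30; for the Rorschach a larger ratio means more responses near the end, while for the timed work tasks a smaller ratio means faster work near the end.

```python
def goal_spurt_ratio(scores):
    """Sum for positions eight through ten divided by the sum for all ten."""
    if len(scores) != 10:
        raise ValueError("expected one score for each of ten positions")
    return sum(scores[7:]) / sum(scores)


def mean_work_task_ratio(times_by_task):
    """Average the Goal-Spurt ratios of the separate work tasks."""
    ratios = [goal_spurt_ratio(times) for times in times_by_task]
    return sum(ratios) / len(ratios)


# Hypothetical S: two responses to every Rorschach card, flat work times.
rorschach_ratio = goal_spurt_ratio([2] * 10)
work_ratio = mean_work_task_ratio([[60] * 10, [58] * 10, [61] * 10])
```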

It was then determined whether an incre- 
ment in a particular S’s productivity on the 
work tasks, as demonstrated by a decline in 
her time scores, was related to an increment 
in her productivity on the Rorschach, as 
demonstrated by an increase in the number 
of her responses. Each individual was ranked 
on the basis of the extent of her Rorschach 
and average work-task Goal-Spurt ratios and 
the correlation determined by the rank-differ- 
ence method. 
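
The rank-difference method is the Spearman coefficient, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), where d is the difference between an individual's two ranks. A minimal sketch follows; the function names are hypothetical, tied values receive averaged ranks (an assumption, since the report does not say how ties were handled), and the formula is exact only when there are no ties.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving ties their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_difference_correlation(x, y):
    """Spearman rho computed from the squared rank differences."""
    n = len(x)
    d_squared = sum(
        (a - b) ** 2 for a, b in zip(average_ranks(x), average_ranks(y))
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))
```

Here `x` and `y` would hold each S's Rorschach and average work-task Goal-Spurt ratios in the same order of Ss.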


Results 


Before combining the data from the four 
separate latin squares of the “Goal-Oriented” 


The Goal-Spurt Hypothesis and the Rorschach Test 


Table 1


Oriented Group Analysis of Variance Results


                                          Mean
Independent observations:         df     square       F

  Order of presentation            9      14.74      1.35
  Residual between individuals
    (error)                       30      10.92
  Total between individuals       39

                                          Mean
Correlated observations:          df     square       F

  Cards                            9      16.82     12.84**
  Positions                        9      21.51     16.42**
  Residual from latin square      72       1.32      1.01
  Residual within individuals    270       1.31
  Total within individuals       360

Total                            399


** Significant at 1 per cent level.


group and the four of the Goal-Disguised 
group, they were tested for homogeneity of 
total variance by means of Bartlett’s test (2). 
The resulting chi square values of 6.16 and 
5.58 were not significant and offered no evi- 
dence against the hypothesis of random sam- 
pling from a common population. 

After combining the data from the separate 
latin squares of the Goal-Oriented group it 
was further ascertained that differences in
productivity between the sequences (or pairs
of Ss responding to the particular order in the
latin square) were not significant.


Table 2


Disguised Group Analysis of Variance Results


                                          Mean
Independent observations:         df     square       F

  Order of presentation            9       6.05       .46
  Residual between individuals
    (error)                       30      13.21
  Total between individuals       39

                                          Mean
Correlated observations:          df     square       F

  Cards                            9      15.00     12.00**
  Positions                        9       2.35      1.72
  Residual from latin square      72       1.44      1.15
  Residual within individuals    270       1.25
  Total within individuals       360

Total                            399


** Significant at 1 per cent level.
Table 3


Aggregate Sums of Responses to
Rorschach Test Cards


                    Standard    Achromatic
                     Cards        Cards       Totals

Goal Disguised        528          515         1,043
Goal Oriented         579          572         1,151
Totals              1,107        1,087         2,194





The insignificant F value of 1.35 in Table 1 
demonstrates this fact. Similarly, there were 
no significant differences between sequences in 
the Goal-Disguised group as the F value of
.46 in Table 2 indicates.

Table 3 reveals that total productivity in 
the Goal-Oriented group was slightly greater 
than productivity in the Goal-Disguised group 
although this difference did not reach statistical
significance (t = .63). The factor of nonorientation
to a goal may have accounted for
the smaller number of responses.

Productivity on the achromatic plates did
not differ significantly from that on the standard
plates (t = .31). This comparison is also
demonstrated in Table 3 and lends credence 
to the belief that the presence or absence of 
color neither stimulates nor inhibits produc- 
tivity. 

The productiveness of the Ss in the Goal- 
Oriented group was greater in the later se- 
quential positions. The mean number of re- 
sponses to the first seven positions was 49.4 
while that of the last three positions was 76.5. 
The t value calculated for the difference between
these means was 3.92, which is significant
at the 1 per cent level. Testing the
significance of the mean square for total re- 
sponses by positions against the residual mean 
square within individuals, it was found that 
the obtained value of F (16.42) is highly 
significant for 9 and 270 degrees of freedom 
(Table 1). 
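
The F values reported here are simply the ratio of an effect mean square to the appropriate residual mean square. The short check below recomputes three of them from the mean squares printed in Table 1; the helper name is hypothetical and the figures are copied from the table.

```python
def f_ratio(ms_effect, ms_error):
    """F is the effect mean square divided by the error mean square."""
    return ms_effect / ms_error


# Mean squares from Table 1 (Goal-Oriented group).
positions_f = f_ratio(21.51, 1.31)   # positions vs. residual within individuals
cards_f = f_ratio(16.82, 1.31)       # cards vs. residual within individuals
order_f = f_ratio(14.74, 10.92)      # order vs. residual between individuals
```

Rounded to two places, these agree with the tabled values of 16.42, 12.84, and 1.35.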

In the Goal-Disguised group, the increment 
in productivity did not appear in the later se- 
quential positions. In fact, there was a rela- 
tive decrement in these later positions. The 
mean number of responses to the first seven 
positions was 53.8 while that of the last three 
positions was 48.2. The ¢ value calculated for 
the difference between these means was 1.06 
which does not reach statistical significance. 
This is also demonstrated by the insignificant
F value (1.72) obtained between the mean
square for total responses by positions against 
the residual mean square within individuals 
(Table 2). 

Another major factor of interest was the 
effect of the cards themselves. When tested 
for significance against the residual mean 
squares, the mean squares for cards were also 
found to be highly significant. In the Goal- 
Oriented group the obtained F value was 
12.84 (Table 1) and that of the Goal-Dis- 
guised group was 12.00 (Table 2). Both of 
these values were significant at the 1 per cent 
level for 9 and 270 degrees of freedom. Card 
X, in contrast to Cards VIII and IX which 
are also endowed with color in the standard 
plates, was by far the most provocative of 
the series. Of special importance is the fact 
that Card X of the achromatic plates was 
similarly the most provocative of this se- 
ries. These findings clearly demonstrate that 
neither the presence or absence of color, nor 
the factor of goal orientation, accounts for 
Card X’s evocative properties in this study. 
It is likely that the unique quality of Card X
is its configuration and this is the crucial factor
facilitating the Ss’ responsiveness.

Whether the work tasks preceded or fol- 
lowed the Rorschach was not related to any 
differences in Rorschach productivity in either 
the Goal-Oriented or the Goal-Disguised 
group as indicated by the respective F values 
of .13 and .49. 

In the Goal-Disguised group, Rorschach 
productivity was not affected differentially 
(F = .80) by hiding the Rorschach cards 
from view or presenting them in a stack of 
twenty. 

Each of the three work tasks in the Goal- 
Oriented group was tested separately by the 
analysis-of-variance technique. Although none 
of the obtained F values approached the point 
of statistical significance, there were incre- 
ments in productivity as the goal was ap- 
proached. This increment began to occur in 
the ninth position (trial) of the Digit-Symbol
and Successive Additions work tasks and in the
eighth position of the Color-Naming task.

The three work tasks in the Goal-Dis- 
guised group were also tested separately by 
the analysis-of-variance technique. Although
none of the obtained F values were significant
in any of the work tasks, there were
decrements in productivity in the later trials
of all tasks.

The values of t for the first seven vs. the
last three positions for the Rorschach test
and work tasks are shown in Table 4.


Table 4


Values of t for the First Seven Versus the Last Three Positions for Rorschach and Work Tasks


                          Goal                          Goal
                        oriented          p           disguised         p

Rorschach                 3.921    .001 < p < .01       1.063     .10 < p < .50
Digit Symbol              1.909     .05 < p < .10        .206     .50 < p < .90
Color Naming              2.233     .05 < p < .10        .214     .50 < p < .90
Successive Additions      2.216     .05 < p < .10        .469     .50 < p < .90

The Goal-Spurt ratios were determined for
each S’s Rorschach test performance and work-task
performance. They were then ranked in
their order of magnitude.

In the Goal-Oriented group, the correlation 
between Rorschach and work-task Goal Spurts 
for those Ss who received the standard Ror- 
schach plates was .572. This was significant 
at the .01 level. The correlation between Goal 
Spurts of the Ss who received the achromatic 
Rorschach plates and the work tasks was .477. 
This was significant at the .05 level. 

In the Goal-Disguised group, the relation- 
ship between Goal Spurts was more erratic. 
The correlation coefficient between Goal 
Spurts in the standard Rorschach plates and 
the work tasks was -.024. For those Ss receiving
the achromatic plates and the work
tasks, the coefficient was .098. Neither was 
significant. 


Discussion 


Rather than account for the increment in 
productivity in the all-color cards (Cards 
VIII, IX, and X) on the assumption of “af- 
fectivity” in response to color, the following 
interpretation seems more appropriate: 

It is assumed that most Ss initially in- 
terpret the Rorschach testing procedure as 
threatening, or at best, challenging (8). This 
may be a function of various factors such as 
the fact that the S is being timed, that she 
does not know how many or how few “things” 
she should see, or what “things” she should or 
should not see. The fact that it is a “mental” 
test which might expose her inadequacies may
be anxiety-arousing. The fact that the large
number of cards before her will involve much 
time and energy expenditure may in itself be 
resistance-arousing. As the S soon learns that 
any and all of her responses are reacted to 
unemotionally in a neutral, accepting atmos- 
phere, she becomes more comfortable and the 
number of her responses may increase and 
then remain fairly constant. As she ap- 
proaches the end of the series of test cards, 
she may feel relieved that she is about to 
complete the task. Along with this comes the 
awareness that the last cards provide the final 
means for demonstrating her ability as ex- 
emplified by the number of her responses. It 
follows that the S’s goal is simultaneously to 
produce more responses and to remove herself 
from the situation. The valence of the incen- 
tive is dependent upon her interpretation of 
the situation and the strength of her reac- 
tions to it. The extent of the increment or 
decrement in responses at the terminal point 
is, then, a reflection of the S’s evaluation of 
goal attainment and her ability to react and 
compensate accordingly. Since the test in- 
volves the circumvention of barriers (stimu- 
lus plates) between the S and the visible goal 
(completion of the task), the reaction to the 
perceived goal should have characteristics 
similar to those seen in learning situations
(Goal Gradient Phenomena) and in work 
situations (End Spurt). The increment or 
decrement in responses can be measured by 
the number of responses to Cards VIII, IX, 
and X divided by the total number of re- 
sponses to Cards I through X. This ratio is 
interpreted as an expression of the “Goal 
Spurt.” 


Summary 


The purpose of this study was to determine 
the effect of knowledge of a goal upon productivity
in the Rorschach test and in a series
of mental work tasks. The data were obtained
from eighty Rorschach-naive student
nurses. They were randomly assigned to a 
Goal-Oriented group, which was informed of 
the number of trials in each task, and a Goal- 
Disguised group in which no information rela- 
tive to the number of trials was communi- 
cated. 

Equal numbers of Ss in both the Goal-Oriented
and Goal-Disguised groups were presented
with either standard or achromatic
Rorschach plates. Each series of ten test 
plates was drawn from a predetermined latin 
square in which the plates were arranged into 
ten distinct systematically randomized se- 
quential orders of presentation. In this man- 
ner each plate appeared in all positions and 
preceded and followed every other plate an 
equal number of times. 

Three mental work tasks were also admin- 
istered to each S in either the Goal-Oriented 
or Goal-Disguised condition. The ten trials 
(cards) of each work task provided a means 
of comparison between productivity in any 
of the ten trials (positions) of the work tasks 
or the ten trials (cards) of the Rorschach 
test. 

The Goal-Spurt ratio was defined as the 
ratio between productivity in work-task or 
Rorschach test positions eight, nine, and ten 
divided by productivity in positions one 
through ten. Ratios were calculated for each 
individual’s Rorschach performance and her 
average work-task performance. 

Within the limits imposed by the homo- 
geneity of the sample used, examination and 
statistical analysis of the data revealed that: 

1. In the Goal-Oriented group, irrespective 
of the presence or absence of color in the Ror- 
schach plates, the position was of major im- 
portance with later-appearing plates produc- 
ing more responses than earlier-appearing 
ones. 

2. In both the Goal-Oriented and Goal- 
Disguised groups, irrespective of the presence 
or absence of color, and independent of their 
orders of sequential relationships, the plates 
themselves differed in the number of responses 
elicited. Card X, because of its unique con- 
figuration, evoked more responses than any 
other. 


3. In the Goal-Oriented group there was a 
significant relationship between each S’s Ror- 
schach and work-task Goal-Spurt ratios. This 
relationship did not exist in the Goal-Dis- 
guised group. 

Thus, it was shown that there is a strong 
positional effect present in both the Ror- 
schach test and a selected group of visual- 
motor perceptual tasks. This effect is demon- 
strated only when the Ss are made aware of 
the number of trials necessary to complete 
the task. In the orthodox order of presenta- 
tion of the Rorschach plates, Card X, by na- 
ture of its unique configurational properties, 
and its final position in the series, facilitates 
an increment in responsiveness. 

For these reasons, the “affective” ratio, or 
the increment in productivity on Cards VIII, 
IX, and X, is better explained on the basis 
of the Goal-Spurt formulation. This hypothe- 
sis is more parsimonious, is objectively dem- 
onstrable, is based upon studies concerned 
with the laws of learning and mental work 
and, quite apart from the various qualities
attributable to color, explains the increment 
in productivity in the later-appearing cards. 


Received September 15, 1955. 
Perceptual Judgment and Associative Learning
Ability of Schizophrenics and Nonpsychotics 1


Jay L. Chambers 


University of Kentucky 2 


Kasanin (5) and others have tried to ana- 
lyze the thought processes of schizophrenics. 
Many of their observations suggested a lack 
of critical judgment in schizophrenics’ think- 
ing, particularly for complex problems such 
as planning for the future. Whether schizo- 
phrenics would also demonstrate impairment 
for simple judgment problems was the pri- 
mary concern of this study. 

A secondary aim was to determine if schizo- 
phrenic intellectual impairment also affected 
learning capacity. Several studies have re- 
ported little or no impairment of learning 
ability among schizophrenics but the controls 
in most of these researches were lacking or 
inadequate, making the results inconclusive 
(1, 2, 3, 4). Therefore, a testing of associa- 
tive learning among schizophrenics was in- 
cluded in the study. A problem involving both 
learning and judgment was added in the 
eventuality that a task combining these proc- 
esses would be necessary to reveal an impair- 
ment in schizophrenic thinking. 

The degree of complexity of a judgment 
was considered, for the purposes of this in- 
vestigation, to depend upon the number of 
variables involved in a judgment. For exam- 
ple, a simple judgment would involve only 
one variable such as the use of IQ alone in 
judging students’ academic potential. A more
complex judgment would require the combining
of several variables for the judgment,
as in predicting students’ academic ability on
the basis of the combined variables of intelligence,
motivation, educational background,
etc.


1 Based upon a Ph.D. dissertation performed at
the University of Kentucky. The writer wishes to
express appreciation to Drs. James S. Calvin, Graham
B. Dimmick, Martin M. White, R. S. Allen,
Ernest Meyers, and A. Dudley Roberts for assistance
on the thesis. The writer is also indebted to the
Lexington and Louisville Veterans Administration
Hospitals, where the subjects were obtained.

2 Now at Muskingum College.

Perceptual judgments of spatial dimensions 
(e.g. length, area, etc.) were selected for the 
judgment problems as they afforded objective 
criteria, neutral or impersonal stimuli, and 
represented a basic type of judgment problem 
practiced by all cultures and individuals. 

With these considerations in mind a pro- 
cedure was designed to determine, if possible, 
whether schizophrenics, as compared with 
nonpsychotics, would demonstrate (a) impairment
in complex but not in simple judgment
problems, (b) impairment in judgment
but not in learning, and (c) impairment in a 
task combining both judgment and learning 
but not in either ability tested separately. 


Method 


In order to test the various hypotheses 
proposed, seven tasks were devised. Each in- 
volved the sorting of geometric designs into 
five categories. The designs were drawn in 
black ink on 2144” X 34%” white index cards 
and were presented by E, who held them up 
one at a time for S’s inspection and then 
placed them in the category chosen by S. In 
each task, the order of the designs was sys- 
tematically varied so that no patterning oc- 
curred. 

The perceptual judgment tasks were con- 
structed, by preliminary experimentation with 
a group of normal subjects, to be of approxi- 
mately equal difficulty. The difficulty of dis- 
criminating between consecutive gradations of 
the stimuli was held constant by maintaining 
a constant ratio of increment between the
gradations. 
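
Holding the ratio of increment constant means that the five gradations of a stimulus form a geometric series. The sketch below shows the idea only: the starting size of 10.0 and the ratio of 1.2 are invented for illustration, since the report does not give the actual values, and the function name is hypothetical.

```python
def gradations(smallest, ratio, steps=5):
    """Return `steps` stimulus sizes, each `ratio` times the one before."""
    return [smallest * ratio ** k for k in range(steps)]


# Hypothetical line lengths: every step is 20 per cent longer than the last.
sizes = gradations(10.0, 1.2)
step_ratios = [later / earlier for earlier, later in zip(sizes, sizes[1:])]
```

Because every entry of `step_ratios` is the same, adjacent gradations differ by the same proportion, which is how the difficulty of discrimination is kept equal along the series.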

Following are brief descriptions of the tasks, 
in order of their presentation. 


Task 1: Associative Learning


This task required S to learn to sort five designs 
(a circle, a square, an “x,” a horizontal line, and a 
vertical line) into five plain boxes. As there was 
little similarity between the designs, the task was a 
relatively “pure” learning problem with a minimum 
of discrimination required to differentiate the stimuli. 
Learning was by trial and error and after each choice 
E informed S of the correct box for the design. Each 
design was presented ten times in the task. 


Task 2: Simple Judgment of Density of Dots 


For this task S was required to discriminate and 
sort five gradations of density of dots presented on 
the index cards. The gradations were numbered “1” 
through “5” in order from the least to most dense. 
The use of numbers provided an already learned 
order of symbols. To provide an initial standard of 
judgment E showed S a set of the five gradations. 
The standard was then removed, representatives of 
the five gradations were presented successively by E
to S, and S was required to judge each card for its 
position in the series of gradations. Each gradation 
was presented ten times in the task. 


Task 3: Simple Judgment of Area of Circles Com- 
bined With Associative Learning 


This task required S to judge five gradations of 
circles and to learn to refer to each gradation by 
the name of a color. (Color names for the grada- 
tions from smallest to largest were “green,” “blue,” 
“brown,” “yellow,” and “red.”) These color referents 
were used for the remaining tasks. A standard of 
judgment was initially presented as in Task 2. Each 
gradation was presented ten times in the task. 


Task 4: Simple Judgment of Degree of Angles 


Five gradations of degree of angle were judged in 
this task. Task 4 did not require learning a new set 
of symbols for reference to the gradations, and was 
therefore a relatively “pure” judgment task. A stand- 
ard of judgment was initially presented as in Task 2. 
Each gradation was presented ten times in the task. 


Task 5: Simple Judgment of Length of Lines 


This task was essentially the same as Task 4 ex- 
cept that five gradations of length of lines were used 
as stimuli. 


Task 6: Review of Tasks 3, 4, and 5 


Six trials for each gradation of circles, angles, and 
lines were presented to renew experience with these 
stimuli, because the complex judgment task to follow 
depended on approximately equal familiarity with 
each type of stimulus. 







Task 7: Complex Judgment Combining Three Vari- 
ables 


This task was designed to test S’s ability to com- 
bine or integrate three variables in making a com- 
plex judgment. For each trial (63 trials), S was 
shown a card on which the three variables (a circle, 
an angle, and a line from the preceding judgment 
tasks) were all presented. The S was instructed to 
consider all three variables in making a judgment 
but was required to respond to the card as a unit 
by giving only one estimate of its position in the 
series of five gradations. No corrections were given 
on this task. For the first fifteen trials (Part a), 
the variables were in agreement with regard to the 
position in the series represented. For the remaining 
48 trials (Part b), one of the cues was opposed to 
the other two. 


The total errors constituted the accuracy 
score for each task. An error was defined as 
the difference between the position of a stimu- 
lus in the series and S’s judgment of its po- 
sition. For Task 7b, the two cues in agree- 
ment were taken as the correct position of 
the stimulus. 
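
The scoring rule can be sketched as follows. Two points are assumptions rather than statements from the report: the "difference" is taken as an absolute difference between series positions numbered 1 through 5, and the function names are hypothetical.

```python
from collections import Counter


def task_errors(true_positions, judgments):
    """Total error: summed absolute gap between true and judged positions."""
    return sum(abs(t - j) for t, j in zip(true_positions, judgments))


def agreed_position(cues):
    """For Task 7b, return the position shared by two of the three cues."""
    position, count = Counter(cues).most_common(1)[0]
    if count < 2:
        raise ValueError("no two cues agree on a position")
    return position


# Hypothetical Task 7b card: circle and angle at position 2, line at 4.
correct = agreed_position((2, 2, 4))
# Hypothetical three-trial record: judged 1, 5, 4 for true positions 1, 3, 5.
errors = task_errors([1, 3, 5], [1, 5, 4])
```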

For comparison of speed differences, the 
number of seconds to complete each task was 
recorded for each S. 

Subjects 

The Ss were all male veterans who had been
in the armed forces during World War II or
later. All but four of the subjects were hos- 
pitalized at either the Lexington, Kentucky 
or the Louisville, Kentucky Veterans Ad- 
ministration Hospitals. The four nonhospital- 
ized veterans matched in other relevant re- 
spects the hospitalized subjects. 

The Ss were divided into two groups, (a) 
those psychotic at the time of testing, and 
(b) those who were not psychotic at the time
of testing and who had no record of having
been psychotic. All the psychotic patients 
carried an official diagnosis of schizophrenic 
reaction except one patient who later carried 
a mixed diagnosis including chronic brain 
syndrome. The nonpsychotics carried a va- 
riety of official diagnoses, the majority fall- 
ing into either character disorder or neurotic 
categories. The two groups did not differ sig- 
nificantly with regard to age or intelligence. 
Each group was composed of 30 Ss. 

In addition to the official diagnosis, the 
psychotic subjects were selected on the basis 
of the ward psychiatrist’s opinion that, despite
active psychosis, mental deterioration
and incapacity were minimal. Only patients
from active treatment wards were used. No
patients undergoing electric shock treatment
were used.


Table 1


Error Scores for Learning and Judgment Tasks


                                Schizophrenic      Nonpsychotic
                                   (N = 30)          (N = 30)
                                                                           Level of
Type of task                     Mean     SD       Mean     SD       t       sig.

Associative learning
  1. Symbols                     54.2    22.2      55.1    20.2      .17
Simple judgment
  2. Dots                        49.0    18.2      30.2     5.7     5.31     .001
  3. Circles                     56.2    18.5      45.2    12.6     2.63     .02
  4. Angles                      59.7    20.5      49.4    18.9     1.97     .06
  5. Lines                       51.9    22.6      37.3    15.0     2.89     .01
  6. Review                     104.3    12.6      76.1    31.0     3.00     .01
Complex judgment
  7a. Circles, angles, lines     16.0     7.7      10.0     4.8     3.51     .001
  7b. Circles, angles, lines     37.0    14.5      18.7    15.7     4.59     .001


Results 


The data of Task 1, presented in Table 1, 
indicated no significant difference between 
schizophrenics and nonpsychotics for accu- 
racy of associative learning. The schizophren- 
ics were consistently poorer in accuracy for 
all of the perceptual judgment tasks, however.
These differences were significant at the
.02 level or better, with the exception of Task 
4 (simple judgment of degree of angle) where 
the difference attained a probability of .06. 
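
The report does not state which formula produced the t values in Table 1. As a rough check under stated assumptions, the sketch below treats the two groups as independent samples of equal size and takes the standard error of the difference as the square root of (SD1^2 + SD2^2) / (N - 1). This is a guess at the method, not the author's documented procedure, and the function name is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, mean2, sd2, n):
    """Independent-groups t from means and SDs, equal n per group.

    Assumes each SD was computed with divisor n, so the standard error
    of the difference uses n - 1.
    """
    standard_error = math.sqrt((sd1 ** 2 + sd2 ** 2) / (n - 1))
    return (mean1 - mean2) / standard_error


# Task 2 (Dots) error scores from Table 1: schizophrenic vs. nonpsychotic.
t_dots = t_from_summary(49.0, 18.2, 30.2, 5.7, 30)
```

For Task 2 this comes out close to the printed value of 5.31; the agreement for the other tasks is approximate at best and should not be read as confirming the assumed formula.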

The data presented in Table 2 indicated 
that the schizophrenics were slower on all the 
tasks. These differences were significant at 
the .05 level or better with the exception of 
Task 7, where the differences did not attain 
significance. A trend toward decreasing sig- 
nificance for speed differences, as reflected by 
the size of t, may be noted as the tasks progressed
in order of presentation.


Table 2


Time Scores in Seconds for Learning and Judgment Tasks


                                Schizophrenic      Nonpsychotic
                                   (N = 30)          (N = 30)
                                                                           Level of
Type of task                     Mean     SD       Mean     SD       t       sig.

Associative learning
  1. Symbols                    334.5   195.2     248.2    84.6     2.18     .05
Simple judgment
  2. Dots                       326.0   185.8     212.7    35.1     3.22     .01
  3. Circles                    343.8   129.9     259.3    58.9     3.18     .01
  4. Angles                     310.5   102.4     252.5    54.4     2.69     .01
  5. Lines                      287.7    94.7     233.7    44.0     2.78     .01
  6. Review                     531.7   172.0     449.2    91.6     2.27     .05
Complex judgment
  7a. Circles, angles, lines     92.0    62.2      70.5    18.7     1.78     .10
  7b. Circles, angles, lines    282.3   187.3     234.7    74.5     1.27     .30
Discussion


If accuracy is used as the criterion for 
learning and judgment ability, then the re- 
sults of this study support the previous evi- 
dence cited from the literature. Schizophren- 
ics showed no impairment of accuracy in 
learning, but were consistently less accurate 
than the nonpsychotics in the judgment 
tasks. In speed, however, the schizophrenics 
were slower on both learning and judgment 
tasks. 

The finding that schizophrenics were im- 
paired in accuracy of judgment but not in 
learning might be interpreted as due to dif- 
ferences in motivation rather than ability. 
Perhaps the schizophrenics were more mo- 
tivated at the beginning of the testing and 
thus performed more accurately on the learn- 
ing task than on the later judgment tasks. 
No consistent preferences were expressed by 
either group for the type of task performed 
but further research, varying the order of 
presentation of the tasks, would better test 
an interest or motivation influence. 

The assumption that schizophrenics would 
demonstrate impairment for complex but not 
for simple judgment problems was not sup- 
ported by the data. Schizophrenics were sig- 
nificantly poorer than nonpsychotics on both 
simple and complex judgment tasks. There- 
fore, there was no indication that degree of 
complexity of judgment would be a signifi- 
cant variable for differentiating these two 
groups. 

An interesting aspect of the data was that 
schizophrenics and nonpsychotics were dif- 
ferentiated through the use of perceptual- 
type judgment problems. Clinical workers 
have long felt that schizophrenics revealed 
inadequacies in judgment of social situations 
and everyday life problems. Unfortunately, 
evaluation of such judgments is necessarily 
crude and unreliable. Tests of perceptual 
judgment of physical dimensions may be just 
as effective for diagnostic purposes and are 
more accurately and objectively evaluated. 

The significant difference between the groups 
for Task 2, which was primarily a judgment 
task and required little learning, indicates
that a combination of judgment and learning
was not necessary to demonstrate impair- 
ment in schizophrenics. Apparently a judg- 
ment factor alone was sufficient to differenti- 
ate schizophrenics and nonpsychotics. Also 
the fact that the schizophrenics continued to 
be poorer in accuracy of judgment after 
Task 3, where the “color names” had become 
learned, offers further reason for rejecting 
this hypothesis. 


Summary 


A group of 30 well-preserved, but actively 
psychotic, schizophrenic subjects were com- 
pared with an equal number of nonpsychotic 
subjects for performance on associative learn- 
ing and perceptual judgment tasks. 

The results tended to confirm other re- 
search indicating impairment in schizophrenic 
judgment without loss of accuracy in learn- 
ing. However, the schizophrenics were signifi- 
cantly slower on both learning and judgment 
tasks. 

The finding that perceptual-type judgment 
tasks differentiated schizophrenics and non- 
psychotics may be used to develop objective 
and accurate judgment tests for diagnostic 
purposes. 

A combined judgment and learning task 
failed to disclose differences between the 
groups of any greater significance than was 
obtained by tasks which emphasized judg- 
ment alone. 


Received August 23, 1955. 
Determination of Defense Mechanisms for 
Conflict Areas from Verbal Material¹ 


Morton Wiener, Bruce Carpenter,² and Janeth T. Carpenter² 
Central State Hospital, Indianapolis 


This study was undertaken to develop cri- 
teria for determining from verbal material two 
general classes of defense mechanisms—re- 
pressive defenses and sensitizing defenses. 
These criteria seem to have application to 
several areas of research. For example, a 
study on perceptual defense utilizing these 
criteria to select subjects has been reported 
by Carpenter, Wiener, and Carpenter (3). 
This technique might also be used to study 
changes in defense behavior as a consequence 
of therapy or for any research problem where 
defense mechanisms are to be determined 
from verbal material. 

The concept of repression has been widely 
accepted in the literature, and there are many 
suggestions as to the nature of repressive 
mechanisms. For example, Maslow and Mit- 
telman (8, p. 608) define repression as “the 
rejection and shutting out of the awareness 
of a reaction pattern (thought, feeling, im- 
pulse, memory) in order to avoid distress”; 
Shaffer (9, p. 211) defines it objectively as 
“a failure to make a certain response, when 
the stimuli are presented that might be ex- 
pected to invoke it”; Thorpe (10, p. 149) 
defines it as “the process by means of which 


1 This study was carried out as part of a larger 
project with the permission and encouragement of 
Clifford L. Williams, M.D., superintendent of Cen- 
tral State Hospital. Grateful acknowledgement is 
made to Dr. Charles C. Josey of Butler University 
for making his students available to serve as Ss for 
this study. The authors wish to thank Mr. Austin 
Jones for serving as a judge and for critically read- 
ing a draft of this paper. Appreciation is expressed 
to Mr. Earl Furlow for assistance in the preliminary 
selection of the scoring criteria, and to the other 
nine psychologists who discussed the criteria in a 
seminar. 

2 Now at Florida State University. 

the individual defends himself against being 
confronted with unwelcome thoughts and de- 
sires.” Similar definitions have been given by 
others (4, p. 220; 11, p. 240). 

Thus the concept of repression seems well 
established. The concept of sensitization as a 
defense mechanism has a much shorter his- 
tory. Bruner and Postman (1, 2) used the 
term “perceptual sensitization” to describe 
the unexpectedly short perceptual reaction 
time of some Ss to “dangerous” stimuli. They 
postulated a selective vigilance on the part of 
an organism which is characterized by meet- 
ing the “dangerous” stimulus with utmost 
alertness and speed. The present authors de- 
fine sensitization as a heightened vigilance 
which operates when the environment sug- 
gests the presence of threatening material; 
the individual is sensitive to the conflictual 
stimulus but to decrease the anxiety aroused 
by it, he changes the significance of the stimu- 
lus (e.g., intellectualization, undoing, displace- 
ment, projection). 

The primary purpose of this study is to de- 
termine whether a reliable criterion can be es- 
tablished for determining repressive defenses 
and sensitizing defenses from verbal mate- 
rial. Secondarily, the question of whether 
there is a generalized defense which indi- 
viduals use for several areas of conflict or 
anxiety will be investigated. 


Method 


Subjects. One hundred forty undergradu- 
ates in a beginning psychology course made 
up the experimental sample. 

Procedure. A special sentence-completion 
blank was devised for use in determining de- 
fense mechanisms. The sentence-completion 
blank was self-administered by each S out- 
side of class. The instructions were: “Com- 
plete these sentences to express your real feel- 
ings. Try to do every one. Be sure to make a 
complete sentence.” The sentence-completion 
blank consisted of the following twenty stems, 
mimeographed with triple spacing between 
stems: 


 1. Sports are                11. Dating 
 2. I don’t want to know      12. People who neck 
 3. Dancing                   13. It bothers me 
 4. Reading is                14. I get mad 
 5. I failed                  15. I really feel 
 6. There are                 16. Men and women 
 7. I resent                  17. What annoys me 
 8. I hate                    18. I secretly 
 9. Sex is                    19. The birds 
10. Walking                   20. Deep down I 


Five of the stems (Nos. 3, 9, 11, 12, and 16) were 
designed to elicit sexual content. Five stems (Nos. 
7, 8, 13, 14, and 17) were designed to elicit content 
in the hostility area. Another five stems (Nos. 2, 5, 
15, 18, and 20) were designed to elicit feelings about 
the self. The other five stems (Nos. 1, 4, 6, 10, and 
19) were designed to be neutral or relatively in- 
nocuous. The stems were all relatively mild so that 
the Ss would not feel discomfort in filling out the 
blank. The sex and hostility stems were intended to 
represent several different intensities and were in- 
tended to lie along a gradient from less to more 
conflictual in our culture. 


In scoring the completed stems another ma- 
jor conflict area emerged in addition to the 
areas of sex and hostility. Many of the stems 
appeared spontaneously to elicit expression of 
feelings of inadequacy. Since in this college 
group adequacy may be an important area of 
conflict, it was decided to score the sentence 
completions for sensitization to inadequacy 
also. 

In scoring the sentence completions, the 
first discrimination attempted by the judges 
was the presence or absence of conflict in the 
response. Two criteria for making this dis- 
crimination were adopted: (a) if a defense 
mechanism clearly was observable in the re- 
sponse, it was scored as conflictual; and (b) 
if the response to the stimulus was an over- 
reaction, underreaction, avoidance, or other 
inappropriate reaction to the stimulus, the re- 
sponse was scored as conflictual. 

The experimental purpose was to dichoto- 
mize conflict responses into two broad classes 
of defense mechanisms, those with repressive 


characteristics and those with sensitization 
characteristics. The following criteria were 
used to score responses as repressive. Exam- 
ples are given in parentheses: 


a. The use of clichés. (I hate war; Sex is here to 
stay.) 

b. Denial of stimulus implication. (I hate 
nobody.) 

c. Avoidance of stimulus. (I hate asparagus.) 

d. Blocking (no response). 

e. Distancing from personal involvement. (I hate 
to think of how cold it is in Greenland.) 

f. Very limited generalizations. (It bothers me 
when someone pops gum in my ear.) 

g. Minimization of involvement in the conflictual 
activity. (Sex is something I won’t know about till 
I’m married.) 

h. Obligation, duty, imposed acceptability by au- 
thority. (Dancing is something young people should 
learn.) 

i. Definitions which lead to avoidance of conflic- 
tual connotations of the stimulus. (Dancing is the 
movement of one’s feet.) 

j. Moralization, romanticization, naiveté, or ideali- 
zation. (Sex is one of the gifts of God that is to be 
used and enjoyed as He wishes.) 


The following criteria were used to score 
the sentence-completion responses as sensiti- 
zation responses. Examples are given in pa- 
rentheses: 


a. Statement of inadequacy or failure. (Sex is 
something I’m very bashful about.) 

b. Rationalization. (I failed my German exam, 
but it was because I was tired from working too 
hard.) 

c. Intellectualization. (It bothers me to read in 
the newspapers that the Communists have taken 
over another country.) 

d. Acting, then undoing or inhibiting. (I hate my 
brother, but only when he fights with me.) 

e. Displacement or projection of feeling to other 
people. (It bothers me to hear people make fun of 
another’s physical defect.) 

f. Preoccupation, including elicitation of a con- 
flictual response by a stimulus which is inappro- 
priate for that particular content area. (I hate to 
see two girls holding hands.) 

g. Projection of motives or feelings to others. (I 
hate people who deliberately hurt others.) 

h. Use of humor with conflictual material. (Sex is 
something I don’t know much about—I’m too in- 
nocent.) 

i. Qualification. (People who neck are all right, if 
they don’t do too much.) 

j. Overreaction to stimulus. (I hate people.) 

k. Denial of importance. (Sex is over-emphasized 
in our society today.) 
As a general scoring criterion, avoidance of 
the stimulus or denial of the activity implied 
in the stimulus was considered a repressive 
mechanism; denial of the importance or con- 
flictual nature of the activity was considered 
a sensitizing mechanism. 

There were eight possible judgments which 
could be made about each response. A re- 
sponse could be scored: (a) without conflict; 
(b) hostility conflict—sensitizer; (c) hos- 
tility conflict—represser; (d) sex conflict— 
sensitizer; (e) sex conflict—represser; (f) 
adequacy conflict—sensitizer; (g) sensitizer 
response without designation of area of con- 
flict; or (h) represser response without desig- 
nation of area of conflict. 

Four clinical psychologists acted as judges. 
Agreement between judges was arbitrarily de- 
fined as at least three of the four judges mak- 
ing the same judgment about a response from 
the eight possible judgments. 
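
The agreement criterion can be stated compactly in code. What follows is only a minimal sketch, not a record of how the judging was carried out; the function name `agreed_judgment` and the short string labels standing for the possible judgments are hypothetical.

```python
from collections import Counter


def agreed_judgment(judgments, min_agree=3):
    """Return the judgment made by at least `min_agree` judges, else None.

    `judgments` holds one label per judge for a single response.
    """
    label, count = Counter(judgments).most_common(1)[0]
    return label if count >= min_agree else None


# Hypothetical labels for the possible judgments.
print(agreed_judgment(["sex-S", "sex-S", "sex-S", "none"]))   # sex-S
print(agreed_judgment(["sex-S", "sex-S", "sex-R", "sex-R"]))  # None
```

With four judges and a threshold of three, at most one judgment can satisfy the criterion, so the result is never ambiguous.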

Each S’s response to each stem was typed 
on a separate card. All responses to each 
item were placed together for judging. Three 
judges first independently scored all 2,100 
critical items. The five neutral items were 
not considered. Those items on which all 
three judges agreed were removed from the 
set and the remaining items were scored in- 
dependently by a fourth judge. Those items 
which reached the agreement criterion were 
used to assign scores to each S. Items on 
which judges did not agree or which were 
judged as not indicative of conflict were not 
included in the scores. Each S was given a 
Represser-Sensitizer (R-S) score for sex and 
an R-S score for hostility. Each score was 
based on the number of responses in a par- 
ticular conflict area judged as sensitizing and 
as repressive. The numbers of repressive and 
sensitizing responses were combined by a 
formula (S − R + 5) into a single score vari- 
able with scale values ranging from 0 to 11. 
A 0 score indicates consistent repressive de- 
fenses in an area. A score of 11 indicates con- 
sistent sensitizing defenses in an area. A score 
of 5 was obtained by Ss who had an equal 
number of repressive and sensitizing responses 
in an area. Each S was also given a general 
R-S score based on the number of responses 
judged as repressive or sensitizing without 
designation of any particular area of conflict. 


The formula (S − R + 5) was used to obtain 
the general R-S score. Each S was given an 
Adequacy score; this score was the number of 
sentences in which an expression of inade- 
quacy was judged to have been elicited. 
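
The scoring formula may be illustrated with a short sketch. The representation of an agreed judgment as an (area, kind) pair and the function names are assumptions made for the example only; the formula S − R + 5 is the one given in the text.

```python
def rs_score(n_sensitizing, n_repressive):
    """Represser-Sensitizer score for one area: S - R + 5."""
    return n_sensitizing - n_repressive + 5


def area_rs_score(agreed_judgments, area):
    """Score one S in one area from (area, kind) pairs that met the agreement criterion.

    kind is "S" for a sensitizing response and "R" for a repressive one
    (a hypothetical encoding).
    """
    s = sum(1 for a, kind in agreed_judgments if a == area and kind == "S")
    r = sum(1 for a, kind in agreed_judgments if a == area and kind == "R")
    return rs_score(s, r)


example = [("sex", "R"), ("sex", "R"), ("sex", "S"), ("hostility", "S")]
print(area_rs_score(example, "sex"))        # 1 - 2 + 5 = 4
print(area_rs_score(example, "hostility"))  # 1 - 0 + 5 = 6
```

Equal numbers of repressive and sensitizing responses give the midpoint score of 5, as stated above.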


Results 


Agreement between judges was obtained for 
72 per cent of the 2,100 sentences judged. Of 
the remaining 28 per cent, the judges split 
(with two judges making one judgment and 
two making another judgment) on 14 per 
cent of the items. The remaining 14 per cent 
showed less agreement than this. Examina- 
tion of the data indicated that certain sub- 
categories of the scoring criteria contributed 
more to the unreliability of judgment than 
did others. For example, it was relatively easy 
for the judges to discriminate reliably the 
“denial” and “blocking” categories, but was 
much more difficult for the judges to differ- 
entiate reliably between “intellectualization” 
(sensitizing) and “distancing from personal 
involvement” or “limited generalization” (re- 
pressive). 

To determine whether there is a tendency 
for individuals to use a general class of de- 
fense mechanism in different conflict areas, 
tabulation was made of the number of Ss 
using repressive mechanisms in both the 
sex and hostility areas, Ss using sensitizing 
mechanisms in both areas, and Ss using 
sensitizing defenses in one area and repres- 
sive defenses in the other. Table 1 shows 
this distribution. Those Ss having scores of 5 
(indicating an equal number of repressive and 
sensitizing responses) in either area are not 
included in this table. The difference between 
the number of Ss using consistent defenses 
and the number of Ss using different defenses 
is not significant. If Ss with scores of 5 (no 


Table 1 

Frequency of Ss Using Repressive and Sensitizing 
Mechanisms in the Sex and Hostility Areas 

                        Sex 
Hostility      Represser    Sensitizer 
Represser         23            32 
Sensitizer        12            26 

Note.—Chi square = 1.003, p > .30. 

Table 2 

Intercorrelations Among Sex R-S Scores, Hostility R-S 
Scores, General R-S Scores, and Adequacy 
Scores for 83 Subjects 

                Hostility    General R-S    Adequacy 
Sex               +.24*         +.32*         −.01 
Hostility                       +.55*         +.25* 
General R-S                                   +.25* 

* Significant at .05 level. 


consistent mechanism) in one area are in- 
cluded in an analysis of defense by area, only 
39 per cent of the 140 Ss show consistent use 
of the same mechanism from one of these 
areas to the other. 
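
The chi square reported for Table 1 can be checked from the four cell frequencies. The sketch below assumes an uncorrected Pearson chi square for a 2 × 2 table; the text does not say whether a continuity correction was applied, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Cell frequencies from Table 1 (rows: hostility; columns: sex).
chi2 = chi_square_2x2(23, 32, 12, 26)
print(round(chi2, 3))  # 1.004
```

Under this assumption the result agrees closely with the tabled value of 1.003 and lies well below 3.84, the .05 critical value for 1 df.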

Further investigation of the generality of 
defense mechanisms was made by correlating 
the scores obtained by 83 of the Ss³ for the 
four areas—sex, hostility, general R-S, and 
adequacy. Table 2 shows these correlations, 
which range from −.01 to +.55. All corre- 
lations except that between the sex and ade- 
quacy areas are significantly different from 
zero. 
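
The significance of each coefficient can be checked with the usual t test for a Pearson r. This is a sketch under stated assumptions: the text does not name the test used, and the critical value of 1.99 for 81 df (two-tailed, .05 level) is an approximate tabled value supplied here, not taken from the source.

```python
import math

# Assumed: approximate two-tailed .05 critical t for 81 df (N = 83).
T_CRIT_05_DF81 = 1.99


def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def is_significant(r, n, t_crit=T_CRIT_05_DF81):
    """True if |t| reaches the supplied critical value (valid here for n = 83)."""
    return abs(t_for_r(r, n)) >= t_crit


# Coefficients from Table 2.
table_2 = {
    ("sex", "hostility"): 0.24,
    ("sex", "general"): 0.32,
    ("sex", "adequacy"): -0.01,
    ("hostility", "general"): 0.55,
    ("hostility", "adequacy"): 0.25,
    ("general", "adequacy"): 0.25,
}
for pair, r in table_2.items():
    print(pair, round(t_for_r(r, 83), 2), is_significant(r, 83))
```

On these assumptions only the sex and adequacy coefficient fails to reach the critical value, which matches the asterisks in Table 2.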


Discussion 


The degree of agreement obtained between 
judges suggests that responses to the sen- 
tence-completion task can be classified fairly 
reliably by use of the scoring criteria. While 
72 per cent agreement is far from perfect 
agreement, it seems adequate in view of the 
rather stringent agreement criterion adopted, 
of at least three of four judges making the 
same judgment from eight possible judgments. 

The results of this study by and large do 
not support a theory of generality of defense 
behavior. Although there appears to be a 
tendency for a large minority (39 per cent) 
of Ss to use the same defense mechanism for 
more than one area of conflict, the majority 
of the Ss did not consistently use one class of 
defense mechanism. In addition, although sig- 
nificant correlations were obtained between 
scores representing mechanisms used in dif- 


3 These 83 Ss were those for whom data were 
available for several measures in a larger study. Care- 
ful examination revealed no apparent differences be- 
tween those Ss included in the correlation matrix 
and those not included. 


ferent conflict areas, these correlations are 
quite low. There would be little improvement 
over chance in attempting to predict any one 
S’s mode of defense in one area from knowl- 
edge of his defense mechanism in another 
area. 

The findings in this study with regard to 
generality of defense are in agreement with 
the results of a study by Goldstein (5). Gold- 
stein found that the majority of his Ss did 
not exhibit consistent defense preference, al- 
though a large minority did. He concluded 
that there were two subgroups of Ss, “specific 
defenders” and a smaller group of “general 
defenders.” 

Additional but nonquantified evidence is 
available which argues against a theory of 
consistency of defensive behavior. It became 
apparent in examining the scoring for indi- 
vidual Ss that some Ss use different defense 
mechanisms for subclasses of stimuli within a 
single conflict area. For example, one S con- 
sistently used repressive defenses in respond- 
ing to heterosexual stimuli but spontaneously 
gave responses expressing concern about homo- 
sexuality to stimuli not expected to evoke a 
sexual response (by definition a sensitizing 
mechanism). Another S used sensitizing de- 
fenses in self-directed hostility responses, but 
used repressive defenses in other-directed hos- 
tility responses. 

The lack of evidence to support a theory 
of generality of defenses makes rather puz- 
zling the results of research employing Ss se- 
lected on the basis of diagnostic labels. For 
example, Lazarus, Eriksen, and Fonda (7) 
selected Ss diagnosed as “obsessive-compul- 
sive” and “hysteric” personalities. When given 
an auditory perceptual task, the obsessive- 
compulsive Ss more accurately identified 
threat words than did the hysteric Ss. If de- 
fenses are specific to conflict areas rather 
than general, why did the hysterics consist- 
ently show repressive behavior? One possible 
explanation is that there is greater consist- 
ency of defense behavior among groups diag- 
nosed as pathological than there is among 
“normals.” Goldstein’s (5) findings of greater 
“emotional disturbance” for the general de- 
fenders than for the specific defenders lend 
support to this hypothesis. However, Kurland 
(6) did not find significant differences be- 
tween a pathological group and a normal 
group in response to “emotional” words on 
an auditory perceptual task. The question of 
whether pathological groups show more con- 
sistent defense behavior therefore remains un- 
resolved. 

Although reliability of judgment of the or- 
der found in this study is adequate for many 
research problems, greater reliability of judg- 
ments based on the criteria might be obtained 
if the subcategories of the criteria could be 
more sharply delineated. Experience gained 
through scoring the responses suggests that 
reliable differentiation of the subcategories 
depends upon response to extreme subtleties 
in verbal expression. As an example, to dif- 
ferentiate between “distancing from personal 
involvement” and “‘intellectualization,” it may 
be necessary to respond to the intensity of 
the verbalization. Experience suggests that an 
intellectualized response reveals considerably 
more intensity of involvement in the verbal 
material than does a repressive response. A 
response, “Jt bothers me to read in the news- 
paper that the Communists have taken over 
another country,” may be qualitatively dif- 
ferent from a response, “Jt bothers me to 
read about the Communists.” The specificity 
of the source, “the newspaper,” and of the 
disliked activity, “taken over another coun- 
try,” suggests more involvement in the verbal 
production. As this technique is used with a 
variety of Ss in a variety of situations, it may 
be possible to raise the reliability by delineat- 
ing the scoring categories in this manner. If 
reliability can be increased sufficiently, use 
of the technique as a clinical tool might be 
profitably investigated. 


Summary 


An attempt was made to devise a tech- 
nique which would permit specification of 










kinds of defense mechanisms used by indi- 
viduals in various areas of conflict. Scoring 
criteria were developed to aid in identifying 
repressive and sensitizing responses to a sen- 
tence completion task. Judges were able to 
classify Ss reliably by use of the scoring cri- 
teria. Converting judges’ classifications to a 
single score variable allowed comparison of 
defense mechanisms used in several conflict 
areas. In general a theory of consistency of 
defense behavior was not supported. 


Received August 29, 1955. 
The Relationship of Anxiety and “Lack of 
Defensiveness” to Intellectual Performance 


Irwin G. Sarason 


Indiana University¹ 


Two studies have recently appeared in this 
journal dealing with the relationship between 
Taylor Anxiety Scale scores and ACE per- 
formance. Matarazzo et al. (3) found anxiety 
scores to be inversely related to ACE scores, 
while Schulz and Calvin (4) found no sig- 
nificant relationship between these two vari- 
ables. One deficiency common to these studies 
was the small number of subjects employed 
in certain of the anxiety groups. Thus, the 
low anxious group of Matarazzo et al. in- 
cluded only 13 Ss; Schulz and Calvin’s high 
anxious group contained only 8 Ss. One aim 
of the present paper is to report the results 
of a study similar to these but employing 
relatively large numbers of Ss in all anxiety 
groups. 

A further purpose of this study was to in- 
vestigate the relation to intellectual perform- 
ance of a defensive test-taking attitude as 
measured by the MMPI K scale. This seemed 
of interest because of the highly significant 
degree of relationship reported in several 
studies between the K scale and the Taylor 
Anxiety Scale. Pearson r’s of −.74 and −.81 
have been reported by Heineman (1) and 
Matarazzo (2), respectively. Also, by includ- 
ing both of these scales in this study it be- 
came possible to evaluate Matarazzo’s hy- 
pothesis that the Taylor scale and K may be 
used as alternate forms. If this is true, one 
would expect similar results for the two scales 
with respect to intellectual measures such as 
those used in this experiment. 


Method 
Subjects 


The Biographical Inventory (5) containing 
the Taylor and K scales was group-adminis- 


1 Now at VA Hospital, West Haven, Conn. 


tered to all Introductory Psychology students 
enrolled at Indiana University at the start of 
the Fall 1954 semester. The 719 Ss used in 
this study include all of those students who 
were entering freshmen and who completed 
the Fall semester. 


Procedure 


Two measures of intellectual functioning 
were employed: ACE raw scores and semester 
over-all grade-point averages. In relating the 
anxiety and K scales to the two intellectual 
measures, Ss were divided into 7 anxiety and 
4 K-scale groups. The Taylor-scale groups 
were ordered from low to high anxiety at in- 
tervals of five. That is, the lowest anxiety 
group included Ss with scores ranging from 1 
to 5, the highest group included Ss with scores 
ranging between 36 and 41. There were 29 
Ss in the lowest anxiety group and 34 Ss in 
the highest group. Each of the other five 
groups included at least 80 Ss. 

The following are the ranges of score for 
the four K scale groups: the lowest K group 
included all Ss with raw K scores between 1 
and 10, the highest included Ss with scores 
between 21 and 30. The remaining two 
groups included Ss with score ranges of 11- 
15 and 16-20. The group of low K scorers in- 
cluded 134 Ss, the high K group contained 
61 Ss. The other groups each contained over 
200 Ss. 
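
The K-scale grouping can be written as a simple lookup. This is a minimal sketch; the names `K_GROUPS` and `k_group` are hypothetical. The Taylor-scale groups are not encoded, since only the lowest and highest score ranges are given in the text.

```python
# Raw K-score ranges for the four groups, as stated in the text (inclusive bounds).
K_GROUPS = [(1, 10), (11, 15), (16, 20), (21, 30)]


def k_group(raw_k):
    """Return the K-scale group label for a raw K score, or None if out of range."""
    for low, high in K_GROUPS:
        if low <= raw_k <= high:
            return f"{low}-{high}"
    return None


print(k_group(8))   # 1-10
print(k_group(16))  # 16-20
```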


Results 


Analyses of variance failed to reveal sig- 
nificant changes in grade-point average with 
increases in score on either the Anxiety scale 
or K scale. The F ratio for the anxiety groups 
was .80, while the F ratio for the four K 
groups was 2.20. This latter F was signifi- 


Table 1 

Means and Standard Deviations of ACE Scores and 
Grade-Point Averages for K Scale Groupings 

                          ACE              G.P.A.* 
K-scale score     N     Mean     SD      Mean    SD 
1-10             134    99.55   25.92    1.22   .76 
11-15            296   108.23   22.92    1.35   .69 
16-20            228   109.17   22.04    1.42   .70 
21-30             61   107.90   23.82    1.36   .67 

* Grade of C = 1, B = 2, etc. 


cant between the .10 and .05 levels (3 and 
715 df). The Pearson r between K and grade- 
point average was found to be .07, which was 
not significantly different from zero. An in- 
significant F of .83 for the mean ACE scores 
of the seven anxiety groups indicated a lack 
of relationship between increases in Taylor 
score and ACE total score. When an F was 
computed for the mean ACE scores of the 
four K groups, an F of 5.48 was obtained. 
With 3 and 715 df, this F is significant at the 
.001 level. Table 1 presents the means and 
standard deviations of ACE scores and grade- 
point averages for the 4 K-scale groups. In- 
spection of this table clearly reveals that the 
significance of the F is due to the mean of 
the low K group which differs markedly from 
the means of the other K groups. The Pear- 
son r obtained between ACE and K scores 
was .09 (p < .05). 
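
The F ratios for the K groups can be approximately recovered from the summary statistics in Table 1. The sketch below assumes a one-way analysis of variance and assumes that the tabled SDs use N − 1 in the denominator; neither detail is stated in the text, so small discrepancies from the reported values are to be expected. All names are hypothetical.

```python
def f_from_summary(groups):
    """One-way ANOVA F from (n, mean, sd) triples.

    Assumes each sd was computed with n - 1 in the denominator.
    Returns (F, df_between, df_within).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# (N, mean, SD) for the four K-scale groups, taken from Table 1.
ACE_BY_K_GROUP = [(134, 99.55, 25.92), (296, 108.23, 22.92),
                  (228, 109.17, 22.04), (61, 107.90, 23.82)]
GPA_BY_K_GROUP = [(134, 1.22, 0.76), (296, 1.35, 0.69),
                  (228, 1.42, 0.70), (61, 1.36, 0.67)]

for name, data in [("ACE", ACE_BY_K_GROUP), ("G.P.A.", GPA_BY_K_GROUP)]:
    f_ratio, df_b, df_w = f_from_summary(data)
    print(name, round(f_ratio, 2), df_b, df_w)
```

Under these assumptions the ACE ratio comes out at about 5.5 and the grade-point ratio at about 2.3, each with 3 and 715 df, in reasonable accord with the reported values of 5.48 and 2.20.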

The r computed between Taylor and K 
scores was found to be equal to −.62 (p 
< .001). An r was also computed between 
Taylor- and K-scale scores for all nonfresh- 
men (N = 448) in Introductory Psychology 
who took the Biographical Inventory at the 
same time as the Ss in this study. The ob- 
tained correlation was −.69, which is signifi- 
cantly different (p < .02) from the r of −.62 
for the freshmen. 
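
One common way of comparing two independent correlations is Fisher's r-to-z transformation. The sketch below is offered only as an illustration of that technique; the text does not name the test behind the reported probability, so this computation need not reproduce it exactly. The function name is hypothetical.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent Pearson r's."""
    z1 = math.atanh(r1)  # Fisher transformation of r1
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / standard_error


# Freshmen: r = -.62, N = 719; nonfreshmen: r = -.69, N = 448.
z = fisher_z_difference(-0.62, 719, -0.69, 448)
print(round(z, 2))
```

The resulting deviate is roughly 2.0, which indicates a difference between the two coefficients that is unlikely to be due to sampling alone.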


Discussion 


The results of this study are in agreement 
with Schulz and Calvin’s (4) demonstration 
of a lack of relationship between Taylor scale 
and ACE scores. The finding of Matarazzo 
et al. (3) that the higher the anxiety score 
the lower the ACE score was in no way sup- 
ported. As Schulz and Calvin suggest, one 
explanation for contradictory outcomes of 
similar studies done at different institutions 
may well be due to extraneous factors asso- 
ciated with the institutions, e.g., difference in 
type of student body. It is interesting that the 
two studies indicating no relation between 
Taylor and ACE scores were done at large 
state universities, whereas the Matarazzo 
et al. study was performed at a private and 
relatively smaller institution. 

The results using the K scale are of sig- 
nificance in that they do not point to a 
gradual monotonic relationship between de- 
fensiveness scores and intellectual perform- 
ance. Both with the ACE and grade-point 
averages, it is clear that Ss in the low K 
group differ markedly from Ss in the rest of 
the distribution. 

The finding that low K Ss perform on a 
considerably lower level than all other Ss in 
the distribution leads, of course, to two pos- 
sibilities. The obtained result could simply 
be due to the fact that individuals with low 
scores on the K scale are not as bright as all 
other individuals. Another possibility, consid- 
ered more likely by this writer, is that low 
K scorers do poorly on tests like the ACE 
because of certain personality characteristics 
which are detrimental to good performance. 

It may be said of low K Ss that they are 
candid and self-critical in the extreme and 
manifest feelings of inadequacy by too freely 
admitting “bad” things about themselves. In 
a sense they are “lacking” in defensiveness. 
If this is so, it is plausible that these persons 
when confronted with a stressful test situa- 
tion in which they feel themselves being 
evaluated -(e.g., the ACE) tend to respond 
with self-depreciatory attitudes. They may 
verbalize to themselves and recall their short- 
comings (e.g., “I am dumb, I can’t do it”) 
while working at the task. To the extent that 
this proneness to self-criticism results in the 
production of responses (e.g., “Maybe I 
won't pass the test”) unrelated to the suc- 
cessful completion of the task, we would ex- 
pect it to be detrimental to a high level of 
performance. It is certainly hard to con- 
ceive of these interfering responses as being 
facilitative of high achievement. The result 
obtained in this study is what one would ex- 
pect; namely, a relatively low level of per- 







formance from those Ss who are overly self- 
critical. 

The fact that the F ratio among the K 
groups was significant for the ACE and only 
approached significance for the grade-point 
averages suggests an interesting theoretical 
question concerning the development of an 
essentially intropunitive self-critical reaction 
to highly motivating situations in which the 
individual feels his ability is being evaluated. 
If one assumes that this lowering of the 
threshold for self-criticism is a learned re- 
sponse to situations in which the individual 
feels under pressure to perform well, we 
would expect that other possible responses 
more relevant to the task in question might 
similarly be learned. That is, the fact that 
someone has learned a nonadaptive habit 
does not preclude his eventually learning an 
adaptive one. For instance, individuals who 
feel that they are inadequate to cope with 
stressful situations, such as taking an intelli- 
gence test, may, if in the situation long 
enough and given many trials, overlearn the 
relevant responses for high performance suffi- 
ciently so that the detrimental effect of the 
learned response of self-depreciation will be 
kept to a minimum. With regard to the pres- 
ent study, it is quite possible that the low 
K’s were able to reduce the effect of their 
interfering self-verbalizations over the course 
of a semester by means of overlearning the 
course subject matter. On the other hand no 
overlearning was possible in preparation for 
the ACE. 

A word remains to be said concerning the 
correlation of −.62 obtained between K and 
A. This correlation is considerably lower than 
those published in other studies (1, 2). In 
order to determine whether this low correla- 
tion was purely a characteristic of students 
at Indiana University or whether it might in 
part be due to the fact that all the Ss were 
freshmen, the correlation between the two 
scales was computed for nonfreshmen as well 
as freshmen. The obtained correlation of 
−.69 for the upper-class group, while still 
lower than other studies, is more in line with 
correlations reported in the literature. It 
seems possible, therefore, that the two scales 
may become more closely associated as the 
length of time in college increases. We have 
already mentioned that the type of student 
body from which Ss are sampled can be an 
important factor and there are certainly other 
socioeconomic factors which are equally rele- 
vant. At the present time it would appear that 
Matarazzo’s suggestion (2) that the two 
scales be used as alternate forms is prema- 
ture. 


Summary 


This experiment was performed to evaluate 
the effects of anxiety (Taylor-scale scores) 
and defensiveness (MMPI K-scale scores) on 
intellectual performance (ACE scores and 
grade-point averages). The results failed to 
show significant changes in these two meas- 
ures of intellectual performance as a function 
of anxiety. However, Ss low in defensiveness 
were found to perform significantly more 
poorly on the ACE than did all other Ss in 
the K-scale distribution. A similar trend was 
found for the grade-point averages. The re- 
sults were discussed in terms of the adverse 
effect of the self-criticism of low K Ss on 
their performance in highly motivating situa- 
tions. 


Received September 12, 1955. 
Evaluative Conceptualizations as the Basis for 

In spite of the general admiration for “ob- 
jectivity” in psychology, it has frequently 
been observed that there is considerable room, 
especially in our clinical practices, for the ac- 
tion of “subjective,” and therefore perhaps 
also detrimental, influences (1, 2, 3, 6). The 
real question is not simply whether this “sub- 
jectivity” is actually present as a factor in 
clinical diagnosis, but whether clinical judg- 
ments are sufficiently reliable phenomena to 
be considered useful as a basis for making 
decisions about the human personality. 

Obviously, the problem is too broad to be 
handled in any single empirical investigation, 
but at least one important aspect of the clini- 
cal situation may be relatively easily ap- 
proached, and shows promise for fruitful re- 
search, viz., the basis upon which psycholo- 
gists make judgments regarding the “mental 
health” of their subjects. 

We will assume for the present that, on the 
basis of their training, clinical psychologists 
are capable of building up fairly accurate de- 
scriptive conceptualizations of their patients, 
and that they make judgments and evalua- 
tions of these subjects on the basis of a par- 
tially subjective comparison of these descrip- 
tive conceptualizations with some standard or 
“ideal” picture of mental health, adjustment, 
or what have you. It follows, then, that two 
psychologists may differ in their judgments of 
the same individual mainly because they have 


1 Subproject of a dissertation presented for the de-
gree of Doctor of Philosophy at Western Reserve
University. The valuable advice and critical com- 
ments of Dr. D. W. Miles and the other members of 
the doctoral committee, Drs. C. S. Hall, C. F. Baker, 
C. R. Porter, and R. Fisher, are gratefully acknowl- 
edged. 


* Now at Highland View Hospital, Cleveland, Ohio. 


grossly different “ideals” against which the 
subject is being compared. It also follows 
that it would be worthwhile if we could make 
these ideal conceptions explicitly measurable 
so as to assess their internal reliability (con- 
sistency within any single psychologist) and 
the extent to which clinicians would be found 
to agree with each other. 

The purpose of the present research has
been to use the Q-sort method (5) in the ex-
amination of psychologists’ “evaluative con-
ceptualizations” and to determine the extent
to which these may be said to represent re-
liable and consistent diagnostic phenomena.


Method 


Subjects. The subjects were twelve psy- 
chologists, all of whom were employed in the 
same state hospital setting. Of these twelve 
individuals, seven were full-time, permanent 
staff members and five were half-time gradu- 
ate students in clinical psychology. All but 
one were currently engaged in diagnostic and 
therapeutic work with psychotic patients. 
The students were under the supervision of 
experienced staff members at all times, and 
none of these students had been in graduate 
school less than two years at the time of this 
investigation. 

Measurements. The measuring device was a 
60-item, structured-sample Q sort, adminis- 
tered in two equivalent forms of 30 items 
each. Q-sort items were selected to fill a 3 x 
2 x 5 factorial scheme twice over. In this 
scheme, the first dimension reflected rigidity, 
flexibility, or disorganization of the person- 
ality processes; the second dimension re- 
flected the direction of psychological energy 
either toward the self or toward others; and 
the third dimension defined specific item-
content in terms of positive affective relation- 
ships, negative affective relationships, cogni- 
tive processes, aspirations, and techniques of 
adjustment. Details regarding the methods of 
scale construction, the means for establishing 
equivalent forms, and the nature of specific 
item-content are available elsewhere (4). 

Q-sort items were sorted by the subject 
into nine piles to describe two separately re- 
quested conceptualizations. The 30 items of 
one form were first placed into the required 
distribution, and the task was immediately 
repeated with the 30 items of the equivalent 
form. Order of presentation of the forms was 
randomly distributed. The nine piles used 
were quasi-normally distributed in frequency 
of card placement from “most descriptive” to 
“least descriptive,” and scores were assigned 
to each pile in the usual manner. The actual 
frequency distribution employed was: 1; 2; 
3; 5; 8; 5; 3; 2; 1; and the scores assigned 
to these categories ranged from 0 for the
least descriptive item to 8 for the most de- 
scriptive one. 
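
The scoring rule just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' procedure: the names `scores_from_ranking` and `pearson_r` and the sample orderings are assumptions introduced here, and the sketch assumes that each item on one form is paired with its equivalent item on the other form.

```python
from math import sqrt

# Pile sizes from the text, ordered from least to most descriptive (sum = 30).
PILE_SIZES = [1, 2, 3, 5, 8, 5, 3, 2, 1]


def scores_from_ranking(ranked_items):
    """Map an ordering of 30 items (least to most descriptive) to pile scores 0..8."""
    if len(ranked_items) != sum(PILE_SIZES):
        raise ValueError("expected exactly 30 items")
    scores = {}
    position = 0
    for score, size in enumerate(PILE_SIZES):
        for item in ranked_items[position:position + size]:
            scores[item] = score
        position += size
    return scores


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical sortings of the two forms by one subject (item labels 1..30).
items = list(range(1, 31))
form_a_order = items[:]
form_b_order = items[:]
# Swap the two extreme items to simulate a disagreement between forms.
form_b_order[0], form_b_order[29] = form_b_order[29], form_b_order[0]

scores_a = scores_from_ranking(form_a_order)
scores_b = scores_from_ranking(form_b_order)
reliability = pearson_r([scores_a[i] for i in items],
                        [scores_b[i] for i in items])
```

Because the distribution is forced, every sorting has the same mean and variance, so the correlation between two sortings depends only on how far individual items are displaced.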

Procedure. Each subject was asked to use 
both forms of the Q sort to describe his con- 
ceptualizations of a psychologically healthy 
person and an ideal hospitalized patient. Thus 
a total of 48 separate 30-item sortings were 
obtained. 

The two frames of reference in which the 
cards were sorted were specified in the follow- 
ing set of directions: 


1. Describe first a perfectly recovered individual, 
one for whom you would unhesitatingly recommend 
release from the hospital ... in short, a psycho-
logically healthy person.

2. Now describe an ideal hospitalized patient, one 
who might be considered for discharge if the home 
conditions were proper; one who would most likely 
get along well on the ward but who is not suffi- 
ciently healthy to warrant a recommendation for 
unconditional release. 


The descriptions provided by the subjects 
were first evaluated for intrasubject reliability 
on the two equivalent forms of the Q sort. 
The mean correlations between all possible 
pairs of subjects were then computed to yield 
indices of homogeneity, or intersubject reli- 
ability on both measures. 
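
As a rough illustration of the homogeneity index, the mean of all pairwise correlations can be computed as below. The function name `mean_pairwise_r` is hypothetical, and the sketch assumes a plain arithmetic mean of the coefficients, since the text speaks only of "mean correlations" and does not mention any transformation before averaging.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_r(sortings):
    """Average the correlation over every distinct pair of subjects' sortings."""
    pairs = list(combinations(sortings, 2))
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)


# With twelve subjects there are 66 distinct pairs to average over.
number_of_pairs = len(list(combinations(range(12), 2)))
```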

An over-all analysis of variance was per- 
formed to determine which groups of items
were significantly high or low in their descrip- 
tive scores and whether differences could be 
found between the two sets of descriptions 
obtained. 

Finally, composite descriptions of each 
“ideal” were constructed by calculating the 
mean placement values (descriptive scores) 
for each item and by redistributing all the 
items back into the original quasi-normal dis- 
tribution according to their average values. 
These composite descriptions were intercor- 
related and compared in terms of item- 
placements as a means for specifying the 
similarities and differences between the two 
conceptualizations investigated. 
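
The construction of a composite description might be sketched as follows. This is a sketch under stated assumptions: the function name `composite_sort` is hypothetical, and ties between equal mean placements are broken by item label, a rule chosen here for illustration because the text does not say how ties were handled.

```python
# Pile sizes from the Method section, least to most descriptive (sum = 30).
PILE_SIZES = [1, 2, 3, 5, 8, 5, 3, 2, 1]


def composite_sort(sortings):
    """Build a composite from several sortings of the same items.

    Each sorting is a dict mapping item label -> pile score (0..8).
    Returns the mean placement per item and the composite pile scores.
    """
    items = sorted(sortings[0])
    means = {
        item: sum(sorting[item] for sorting in sortings) / len(sortings)
        for item in items
    }
    # Rank items by mean placement; ties are broken by label (assumption).
    ranked = sorted(items, key=lambda item: (means[item], item))
    composite = {}
    position = 0
    for score, size in enumerate(PILE_SIZES):
        # Deal the ranked items back into the forced quasi-normal piles.
        for item in ranked[position:position + size]:
            composite[item] = score
        position += size
    return means, composite
```

Redistributing the means into the original piles keeps each composite on the same scale as an individual sorting, so composites can be correlated with one another in the same way as individual sortings.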


Results 


The mean intrasubject reliabilities of the 
described conceptualizations were found to be 
.82 for the healthy person descriptions, and
.80 for the descriptions of the ideal hospital- 
ized patient. (All figures are Pearson product- 
moment correlation coefficients.) These values 
compare satisfactorily with the reliabilities of 
many psychological instruments, and their 
magnitudes indicate that the subjects were 
describing conceptualizations which remained 
fairly stable, at least from one sorting to the 
next. 

The mean intersubject reliabilities (homo- 
geneities) were found to be .68 for the healthy 
person descriptions and .59 for the descrip- 
tions of the ideal hospitalized patient. These 
figures indicate a moderately high level of 
agreement between subjects, although they 
suggest that there were probably quite a few 
differences with respect to specific items. 

The intercorrelations of the composite de- 
scriptions constructed for both “ideals” were 
found to be .72 on one form of the Q sort 
and .67 on the other. Again, there is indica- 
tion that the two ideals are much the same 
in over-all characteristics, but there is no in- 
dication of agreement on all items of the 
measuring instrument. 

Analysis of variance of the obtained sort- 
ings revealed the following significant item 
categories, according to the Q-sort item struc- 
ture already outlined: 

Items reflecting flexibility, regardless of di- 
rection or content, were found to obtain sig- 
nificantly high descriptive values in both con- 
ceptualizations. 

Items reflecting disorganization, regardless 
of direction or content, were found to obtain 
significantly low placement values in both 
conceptualizations. Especially low placement 
was given to items reflecting disorganization 
of the negative affective relationships. 

Items reflecting rigidity, regardless of di- 
rection or content, were found to obtain sig- 
nificantly low values only in the descriptions 
of the psychologically healthy person. 

What these findings mean in terms of spe- 
cific item content was determined by examin- 
ing the composite sortings to locate those Q- 
sort statements which received both highly 
consistent and highly inconsistent placement 
when one “ideal” was compared with the 
other. 

The following are examples of items con- 
sidered by the subjects to be “desirable” (as- 
signed to the upper extreme categories of the 
Q-sort distribution) in both conceptualiza- 
tions: * 


18. Considerate of others in his personal relation- 
ships. 
16. Understands the feelings of others. 


The following are examples of items con- 
sidered by the subjects to be “undesirable” 
(assigned to the lower extreme categories of 
the Q-sort distribution) in both conceptuali- 
zations: 


23. Likely to commit impulsive suicide. 


57. Likely to get uncontrollably angry with others 
for no good reason. 


The following are examples of items which 
were considered desirable for the healthy per- 
son, but “unimportant” (assigned to a middle 
category) for the ideal patient: 


12. Likes people for what they are. 
15. Knows his own assets and liabilities. 


The following are examples of items consid- 
ered to be undesirable for the healthy person 
and unimportant in the ideal patient: 


6. Makes inflexible all-or-none judgments about 
people which he won’t change even in the face of 
facts. 


* The item numbers are those assigned in the origi-
nal Q sort.

4. Believes in severe punishment for people he
thinks are immoral. 


The following item was considered desir- 
able for the ideal patient but unimportant for 
the healthy person: 


14. Never gets angry at people without good
reason.


The following item was considered to be 
“undesirable” for the ideal patient but unim- 
portant for the healthy person: 


57. Takes the first opportunity he sees to satisfy 
an impulse. 


There were no items which were considered 
undesirable in one description and desirable 
in the other. 


Discussion 


The present research indicates a very sub- 
stantial level of over-all agreement in the 
evaluative conceptualizations of this group of 
psychologist-subjects. If we bear in mind that 
the reliability of the single subject in this 
study was about .80, it is evident that any 
intersubject reliability which approaches this 
value would indicate as great an agreement 
between subjects as exists within them indi- 
vidually. If this were the actual case, one 
would expect psychologists to be a sterile lot 
who never get into more disagreements with 
each other than they do with themselves on 
the basis of their own individual inconsist- 
ency. 

Actually, these results are about what one 
might hope for. The figures show sufficient 
agreement so that we might expect this group 
of psychologists to be agreed at least “in prin- 
ciple”; but the figures do not show sufficient 
conformity to indicate a rigidity of concep- 
tion throughout the sample. From the pa- 
tient’s point of view this would seem to indi- 
cate that one could be quite certain of re- 
ceiving a consistent evaluation regardless of 
which of these psychologists were diagnosing 
him (as long as there were no errors in the 
descriptive conceptualization itself), but that 
specific bits of behavior might have different 
meaning or importance depending upon the 
examiner. 

The “ideal” conceptualizations obtained 
here cannot be said to be right or wrong ex-
cept on the basis of the acceptance of some
uniform set of standards. The data indicate, 
however, that these psychologists do recog- 
nize differences between the requirements of 
hospital and nonhospital society. They fur-
ther reveal an emphasis upon flexibility and
the need for recognizing others in one’s be- 
havior. One cannot be sure what causes these 
conceptualizations, but it is apparent that 
this sample is highly coherent as a group. It 
would be interesting in the future to examine 
less homogeneous populations and to make 
comparisons between the “ideals” of behav- 
ior which characterize various professional 
groups, or which characterize psychologists 
in various cultural and social environments. 
It would be equally interesting to study, for 
example, the development of evaluative con- 
ceptualizations as a student progresses through 
training in psychology. 

The Q-sort scale employed here was de- 
signed for use in describing the behavior of 
psychotic patients, and the extreme character 
of some of its items may have induced a level 
of agreement which is greater than one would 
normally expect. Nonetheless, this investiga- 
tion has demonstrated the feasibility of ex- 
amining these conceptual “ideals” by which, 
not only psychologists, but all people make 
judgments about each other. 


Summary 


A Q-sort investigation was made of twelve 
psychologists’ conceptualizations of the psy- 
chologically healthy person and the ideal hos-
pitalized patient. 

Equivalent Q-sort forms were used to es-
tablish intrasubject reliabilities, and these
were found to be .82 and .80 respectively for
the two conceptualizations.

Average intersubject agreements were found 
to be .68 for the psychologically healthy per- 
son descriptions, and .59 for the descriptions 
of the ideal hospitalized patient. These values 
were interpreted as indicating a high level of 
general agreement between subjects, but by 
no means as expressive of a unity of opinion 
on all specific trait items. 

The data were analyzed to reveal differ- 
ences between the two conceptualizations, and 
samples of specific items were presented to 
illustrate the findings. 


Received August 23, 1955. 
The preponderance of research evidence in-
dicates that there is a significant relationship 
between measured intelligence and socioeco- 
nomic level in that lower-class members do 
less well than those higher in the socioeco- 
nomic hierarchy on virtually all widely used 
intelligence tests. There is considerable dis- 
agreement among psychologists, sociologists, 
and allied social scientists as to the extent to 
which this positive relationship is a function 
of the “culture-bound” nature of the devices 
used to measure intelligence and the extent 
to which it is a function of true, innate dif- 
ferences in the intellectual capacities of vary- 
ing occupational or social classes. Probably 
no one would deny that test biases are re- 
sponsible for some of the measured class dif- 
ferences. 

In the hope of minimizing, in their test, the 
cultural bias assumed strong in most group 
intelligence tests, Davis and Eells (1) in
1953 issued the Davis-Eells Test of General 
Intelligence or Problem-Solving Ability, other- 
wise known as the Davis-Eells Games. The 
test yields an Index of Problem-Solving Abil- 
ity (IPSA) derived in the same fashion as an 
IQ. It is a power test, requires no reading, 
consists primarily of pictorial representations 
of multiple-choice problems for which the 
teacher reads the problem, and is said to be 
interesting and challenging to children of all 
social levels. It was designed for use in the 
first six grades of school and has a primary 
and elementary form. 

The appeal of such a test to school per- 
sonnel is strong. Many teachers, as well as 
laymen, are wary of standard intelligence 
tests. They know that not all their children 
are identified correctly by current tests. All 
of them would welcome an index of true 
learning ability which would identify those 


pupils who are and those who are not work- 
ing to capacity. Is the Davis-Eells Test such 
an index? 

Problems of test validation are nearly in- 
surmountable if one accepts the assumption 
of Davis and Eells that school grades, read- 
ing skill, and other measures of academic suc- 
cess cannot be accepted as validation criteria 
since these indices are strongly culture-bound, 
whereas the whole intent of the Davis-Eells 
Test is to minimize such social structuring. 
With what, then, can one correlate the test to 
assess some measure of its validity? The au- 
thors seem to accept no external criteria. The 
validity of the test must be taken on faith— 
that is, in consideration of the reasonableness 
of the test problems, the care taken in the 
eight years of test standardization, and the 
extent to which one agrees with the authors’ 
premises and their criticisms of other intelli- 
gence tests. 

Beyond its theoretical and research inter- 
est, however, a test must have some demon- 
strated relationship to some aspect of the 
school program to justify its widespread use 
for school purposes. Group intelligence tests 
now in use in schools have amply demon- 
strated effectiveness in predicting academic 
achievement, grouping pupils, aiding in vo- 
cational counseling, identifying underachiev- 
ers, and the like. That they do these things 
imperfectly is obvious, but that they do them 
better than they could be done without tests 
is equally apparent. 

In terms of practical considerations, then, 
if the Davis-Eells Test were to supplant pres- 
ently used intelligence tests in the schools, it 
would need to be demonstrated that the test 
could perform the aforementioned functions 
better. If it were to supplement other tests, it 
should shed some new insights regarding spe- 
cific children or groups of children. Since no
single set of criteria appears available to as- 
sess its value in these roles, particularly the 
latter one, perhaps an accumulation of ana- 
lytical studies may show ways in which the 
test could be used to advantage. 


Problem 


The intent of the present study was not to 
investigate the general validity of the Davis- 
Eells Games. Instead it poses several specific 
questions which, if the authors’ premises be 
accepted, do not necessarily bear upon true 
test validity but are nevertheless relevant to 
the test’s practical use in an elementary school 
setting: (a) What are the interrelationships 
among the Davis-Eells Test and the Cali- 
fornia Tests of Reading, Arithmetic, and 
Mental Maturity (3) for a typical sample of 
fourth-grade children? (b) Are the scores of
bilingual children, who were not included in
the standardization sample of the Davis-
Eells Test and whose scores are presumptively
lowered by culture-bound items, significantly
better on the Davis-Eells than on the Cali-
fornia Test of Mental Maturity (CTMM)?
(c) What relationships exist between social- 
class levels and scores on the above tests? 
(d) What types of children do better on the 
Davis-Eells Test than the CTMM, and vice 
versa? 


Subjects 


Since the Elementary A form of the Davis- 
Eells Games (hereafter referred to as the 
Games) is intended for use in grades 3-6, it 
was felt that choice of subjects from a fourth- 
grade sample would probably insure an ade- 
quate test ceiling as well as an adequate base 
for extreme scorers. Choice of fourth-graders 
was also a practical consideration since rou- 
tine group achievement and intelligence test- 
ing at the fourth-grade level was already part 
of a county-wide testing program. Use of a 
relatively homogeneous age group also elimi- 
nated the need to partial out age as a factor 
in subsequent test correlations. 

Children from all the fourth-grade classes 
of four elementary schools of Santa Barbara 
County participated in the study. The four 
schools, whose enrollments were between 300 
and 700, were within a radius of ten miles 
from the city of Santa Barbara. Two of the
schools drew their pupils from above-average 
residential areas, while the other two included 
a higher percentage of children of agricultural 
workers. The first two schools had few bilin- 
gual children, whereas in the latter two, ap- 
proximately one child in five came from a 
home in which Spanish was spoken. 

Prior test surveys in the four schools had 
suggested that, taken together, they would be 
a reasonable cross section of a fourth-grade 
school population. This later proved to be the 
case, with some minor exceptions to which 
reference will be made later. 

A total of 184 children from the classes of 
nine teachers in the four schools constituted 
the final sample to whom all tests were given. 
There were 93 boys and 91 girls. These 
pupils represented 81 per cent of the total 
actual enrollment of all fourth-graders in the 
four schools. No children were intentionally 
omitted from the study, but absence on one 
or another of the testing days and other un- 
avoidable circumstances prevented perfect 
participation. 


Method of Test Administration 


Four group tests were administered to all 
the children in their regular classrooms by 
their regular teachers. These tests were the 
Games, Elementary A; the California Test of 
Mental Maturity (CTMM), 1950 Elemen- 
tary S-Form; the California Reading Test, 
Elementary AA; and the California Arith- 
metic Test, Elementary AA. 

All tests were given during the first two 
months of the fall school term and rotated 
so that in some classes the Games were given 
first, in others the CTMM, and so on. Test- 
ing did not start until the term had been 
under way for several weeks. The two intelli- 
gence tests were given within a period of two 
weeks of each other in each class, and all tests 
in all four schools were administered within 
a calendar month and five days. Some time 
interval between tests was felt imperative by 
the teachers to avoid student fatigue and pre- 
vent undue interference with the school pro- 
gram. 

Cumulative folders for all pupils were made 
available by the schools. Information relating 
to parental occupation, birthplaces of parents 
and child, language of the home, address, and
date of child’s enrollment in the given school
was recorded when available.


Results 


Test interrelationships for total sample. As 
far as mean scores on the two intellectual 
measures were concerned, the CTMM IQ and 
Games IPSA were virtually identical. Both 
were only a point above the norms of the 
standardization samples. Since the mean age 
of the children was exactly that of the Cali- 
fornia Test Bureau figure for fourth-graders 
in their second month of the school term, the 
sample may be considered representative in 
this respect also. 

In several respects it deviates slightly from 
the norm, however. The standard deviation of 
IQ’s for the CTMM and its subtests exceeds 
the norm, while that for the Games is less 
than the standard deviation indicated in sev- 
eral studies reported in the Manual (1). 

More striking is the discrepancy between 
CTMM Language and Non-Language quo- 
tients favoring the latter. This is perhaps due 
in part to the scores of the bilingual children, 
whose relatively greater aptitude on non- 
verbal tests might be anticipated. 

The reading-grade placements of the chil-
dren, as can be seen from Table 1, were at
the expected level. Their arithmetic scores
were slightly below. The latter is perhaps due
in part to differential rates of summer for-
getting, with arithmetic scores the more likely
to suffer from holiday disuse.1

Intercorrelations of the intelligence tests as 
shown in Table 2 indicate that the CTMM 
and Games are related less closely than most 
widely accepted intellectual measures. As one 
might have expected, the Games are more di- 


1 It was originally planned that the reasoning and 
fundamentals subtests of the arithmetic test should 
be given separate statistical treatment and reported 
separately in all ensuing tables. It had been tenta-
tively hypothesized that arithmetic fundamentals, not
depending on reading as did the reasoning problems, 
might bear a greater degree of relationship to the 
Games and CTMM Non-Language than to the other 
tests. It had also been hypothesized that bilingual 
children might perform significantly better on the 
fundamentals test. Since neither of these hypotheses 
was supported by the data, only the total arithmetic 
score is reported herein. 
Table 1

Characteristics of Sample with Respect to
Age and Test Results

Item                      Mean             SD
Age                       9 yr. 4.3 mo.    6.4 mo.
Davis-Eells IPSA          101.0            15.5
Davis-Eells Raw Score     39.2             7.0
CTMM Total IQ             101.3            18.0
CTMM Language IQ          97.2             19.3
CTMM Non-Lang. IQ         107.4            22.7
Calif. Read. Gr. Pl.      4.1              1.4
Calif. Arith. Gr. Pl.     3.9              .8


Table 2

Pearsonian Correlations Between Measures
of Intelligence

                                  CTMM IQ
Test                      Total    Lang.    Non-Lang.
Davis-Eells IPSA          .49      .37      .46
CTMM Total IQ                      .86      .76
CTMM Language IQ                            .42







rectly related to the CTMM Non-Language 
Test, which requires no reading, than to the 
Language Test, which does. Still the CTMM 
Total IQ has the highest r with the Games. 
The latter correlation of .49 is reasonably 
typical of those reported in the Manual for 
the Games vs. several forms of the Otis. 
The relationships obtained from correlation 
of the Games with the achievement tests were 
also remarkably similar to comparable calcu-
lations reported in the Manual, where median
coefficients of .43 for the Games vs. reading
and .41 for the Games vs. arithmetic were re- 
ported for several batteries of achievement 
tests. It can be seen from Table 3 that the 


Table 3

Pearsonian Correlations Between Measures of
Aptitude and Achievement

                          Davis-            CTMM
                          Eells
Achievement Test          Raw       Tot.    Lang.   N-L
Grade Placement           Score     MA      MA      MA
California Reading        .48       .79     .77     .49
California Arithmetic     .43       Si      $8      .52

CTMM Total has the highest degree of as-
sociation with both indices of achievement. 
The CTMM Language has the next highest 
correlations, followed by the Non-Language 
which exceeds the Davis-Eells only slightly. 
Some changes in these correlations would 
have been effected by corrections for range, 
since the CTMM range was too wide and the 
Davis-Eells too narrow, reducing the correla- 
tions with the CTMM and increasing those 
with the Davis-Eells. 
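
The direction of such a correction can be illustrated with a commonly used univariate range-correction formula, r' = rk / sqrt(1 - r^2 + r^2 k^2), where k is the ratio of the reference standard deviation to the sample standard deviation. This is a sketch only: the text does not say which correction the authors had in mind, the function name `correct_for_range` is hypothetical, and the reference standard deviation of 16 is a purely hypothetical value. The sample standard deviations are the IQ-scale values from Table 1, used here as stand-ins because Table 3 is based on raw scores and mental ages.

```python
from math import sqrt


def correct_for_range(r, sd_sample, sd_reference):
    """Adjust r for a difference in spread between sample and reference group."""
    k = sd_reference / sd_sample
    return r * k / sqrt(1 - r ** 2 + (r * k) ** 2)


HYPOTHETICAL_REFERENCE_SD = 16.0  # assumption, not a figure from the study

# CTMM: sample SD (18.0) wider than the assumed reference, so r shrinks.
ctmm_reading = correct_for_range(0.79, 18.0, HYPOTHETICAL_REFERENCE_SD)

# Davis-Eells: sample SD (15.5) narrower than the assumed reference, so r grows.
games_reading = correct_for_range(0.48, 15.5, HYPOTHETICAL_REFERENCE_SD)
```

Whatever reference values are assumed, a sample range wider than the reference lowers the corrected coefficient and a narrower one raises it, which is the pattern the paragraph describes.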

The bilingual sample. A bilingual child was 
herein defined as one whose school records 
included the written notation that a language 
other than English was spoken in the home. 
Such children, 31 in number, constituted 17 
per cent of the total sample. Spanish was the 
second language of 27 of the children, Japa- 
nese of two, and German and Italian of one 
each. 

It was interesting to find that all but three 
of the children were born in the United States 
and all but eight in Santa Barbara County. 
The majority of the fathers of these children 
had been born in the countries whose lan- 
guage they spoke, while the majority of the 
mothers were born in the United States. One 
might infer, then, that some English was also 
spoken around the homes of many of these 
children. Yet there was some presumptive 
language handicap operating which might be 
expected to influence adversely the score of 
the bilingual child on written tests. 

Furthermore these children were also with 
few exceptions from underprivileged socioeco- 
nomic levels. Sixteen of their fathers worked 
as laborers, two as custodians, and the ma- 
jority of the remainder in unskilled or am- 
biguously stated occupations. One barber, one 
painter, and one ranch foreman were the ex- 
ceptions to the homogeneous parental occu- 
pational pattern of these bilingual children. 
The education of the fathers was also lim- 
ited, and only five were listed as having gone 
beyond the junior high school level. 

A comparison of the scores earned by the 
bilingual children on the various tests can be 
seen in Table 4. Their best intelligence test 
score was on the Games, followed closely by 
the CTMM Non-Language. As might be ex- 
pected, the children’s mean Language IQ was 
the lowest of the three. The difference of 11 
Table 4

Test Results for Bilingual Sample

Test                       Mean     SD
Davis-Eells IPSA           97.7     15.4
CTMM Language IQ           86.7     14.3
CTMM Non-Language IQ       93.8     24.9
Calif. Reading Grade       3.7      1.3
Calif. Arith. Grade        3.7      .8





IQ points between the Games IPSA and 
CTMM Language IQ is significant at the .01 
level of confidence, but the 4-point difference
between the Games and CTMM Non-Lan-
guage has a critical ratio of only .74 and is,
therefore, not significant.
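
The critical ratios for the bilingual sample can be approximated from the means and standard deviations in Table 4. The text does not state which standard-error formula was used; the sketch below assumes the formula for uncorrelated means, which appears to reproduce the reported value of .74. A formula for correlated means would also need the correlation between the two tests within the bilingual group, which is not reported. The function name `critical_ratio` is hypothetical.

```python
from math import sqrt


def critical_ratio(mean_a, sd_a, mean_b, sd_b, n):
    """Difference between two means divided by its standard error.

    Uses the uncorrelated-means standard error (an assumption), with the
    same number of cases n behind both means.
    """
    standard_error = sqrt(sd_a ** 2 / n + sd_b ** 2 / n)
    return (mean_a - mean_b) / standard_error


N_BILINGUAL = 31  # number of bilingual children reported in the text

# Games IPSA vs. CTMM Language IQ
games_vs_language = critical_ratio(97.7, 15.4, 86.7, 14.3, N_BILINGUAL)

# Games IPSA vs. CTMM Non-Language IQ
games_vs_nonlanguage = critical_ratio(97.7, 15.4, 93.8, 24.9, N_BILINGUAL)

print(round(games_vs_language, 2))     # 2.91
print(round(games_vs_nonlanguage, 2))  # 0.74
```

A ratio of about 2.58 or more corresponds to the .01 level in a two-tailed normal test, so the first difference clears that threshold and the second does not.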

Parental occupation and measured intelli- 
gence of children. It had been hoped origi- 
nally that it would be possible to obtain an 
Index of Status Characteristics for each child 
in the study, following the method described 
by Warner, Meeker, and Eells (4). Unfor- 
tunately, the information recorded in the 
available school records was inadequate for 
such a detailed appraisal. Even such items as 
parental occupation were frequently stated in 
an equivocal manner, e.g., “Self-employed.” 
Nevertheless, parental occupation was de- 
cided upon as the best single available index 
of class level, and each child was given a 
rating of from 1 to 7 following the categories 
in the aforementioned reference. 

The resultant distribution of scores was 
definitely overweighted in the professional 
and semiprofessional areas, with 17 per cent 
of the children so categorized. The parental 
occupations of 58 children unfortunately were 
either not given or not classifiable because of 
an inadequate description. Because of these 
marked limitations in the data, detailed sta- 
tistical treatment seemed inadvisable. Never- 
theless a rough grouping of the classes was 
made and mean IQ’s computed. Results are 
shown in Table 5. 

The fourth occupational class—skilled work- 
ers—had mean quotients of 105.0 and 103.9, 
respectively, on the CTMM and Games; in- 
cluding them in the grouping of the upper 
classes would have increased the significance 
of the difference between the test means to 
the .01 level of confidence. As it is, the dif- 
Table 5

Parental Occupation and Intelligence Test Scores

                              CTMM IQ
Occupational class     N      Mean     SD
Top 3 classes          41     113.1    10.2
Bottom 3 classes       67     95.6     16.1





ference is significant at the .02 level, and in 
the expected direction: children from above- 
average backgrounds do better on the tradi- 
tional type of intelligence test than on the 
Games. Yet the reverse does not hold true to 
a statistically significant degree; the children 
from the less privileged classes as here de- 
fined did not do significantly better on the 
Games than on the CTMM. The trends are 
in the expected direction but their magnitude 
is minimal. 

The difference in variability in the two tests 
is also of interest. The difference between the 
CTMM standard deviations of scores for the 
top as opposed to bottom three classes is sig- 
nificant at the .01 level. That is, the top three 
classes have significantly less variability on 
the CTMM than do the bottom three classes. 
On the other hand, the variability of Games 
scores for the opposing occupational group- 
ings is roughly the same. The CTMM scores 
appear closer to what one would expect of a 
“culture-bound” test.

The scores of children whose parental oc- 
cupations were not given or were ambiguously 
stated had means of 98.4 and 100.1 on the 
CTMM and Games, respectively. It is pos- 
sible that there was a disproportionate share 
of lower-class children in this group, but even 
including them with the lower groupings 
would have affected the findings but little. 

Table 5 (continued)

                          Davis-Eells IPSA
Occupational class     Mean     SD     Diff.     CR
Top 3 classes         106.4    13.4     6.7     2.55
Bottom 3 classes       97.9    14.0     2.3      .88

Correlates of intelligence test discrepancies.
Davis and Eells had indicated that
problem-solving ability as measured by the Games
was not to be strictly equated with standard
concepts of intelligence as measured by most
current group tests. It seemed of interest,
therefore, to examine in some detail the
characteristics of those children whose IPSA and
IQ were markedly in disagreement, to see if
meaningful correlates of the variable could be
determined. A difference of 20 or more points
between the scores obtained by a given child on
the two tests was chosen rather arbitrarily as a
cutting point which should indicate both
statistical and diagnostic significance. Table 6
describes the children who so excelled relatively
on the Games and CTMM respectively.

One child in four had an IQ-IPSA discrep- 
ancy of 20 points or more. It can be seen that 
the children performing markedly better on 
the Games than on the CTMM constituted 
a slightly larger group than its opposite 
analogue: 14 per cent vs. 11 per cent of the 
total sample. As would have been hypothe- 
sized, more bilinguals excelled on the Games 
than CTMM. 
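
The proportions quoted above follow from the counts in Table 6 and the sample size of 184 given in the summary. A small arithmetic check, with illustrative variable names:

```python
TOTAL_SAMPLE = 184   # fourth-grade children in the study
ipsa_greater = 25    # IPSA exceeds CTMM IQ by 20 or more points
ctmm_greater = 21    # CTMM IQ exceeds IPSA by 20 or more points

def percent(count, total):
    """Share of the total sample, as a percentage."""
    return 100 * count / total

pct_ipsa = percent(ipsa_greater, TOTAL_SAMPLE)
pct_ctmm = percent(ctmm_greater, TOTAL_SAMPLE)
pct_either = percent(ipsa_greater + ctmm_greater, TOTAL_SAMPLE)

print(round(pct_ipsa), round(pct_ctmm), round(pct_either))
```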

The sex ratio is perhaps the reverse of what 
might have been anticipated in view of the 
stereotype of the greater verbalism of girls. 
More girls actually do better on the Games 
while more boys do better on the CTMM. 

Table 6

Characteristics of Children with IPSA-CTMM
Discrepancies of 20 or More Points

                                     IPSA      CTMM
Item                                Greater   Greater
Number of children                     25        21
Number of bilinguals                    9         1
Sex ratio, boys/girls                11/14      12/9
Mean Davis-Eells IPSA               112.8      96.0
Mean CTMM Total IQ                   86.4     123.9
Mean CTMM Language IQ                82.0     122.0
Mean CTMM Non-Language IQ            92.0     124.0
Mean Reading-Grade Placement          3.6       5.3
Mean Arithmetic-Grade Placement       3.7       4.5
Parental occupation:
  Classes 1-3                           2         4
  Class 4                               3         4
  Classes 5-7                          11         6
  Unknown                               9         7

The test scores of the children excelling on the
Games over the CTMM present, except for the IPSA,
the test-score pattern of poor students. Their
IQ’s on the CTMM and its subportions are
dull-normal to low-average; their achievement is
below their grade placement in both reading and
arithmetic. Yet in terms of the Games, the
children are nearly a sigma above the mean in
problem-solving ability.

Conversely, the 21 children with higher 
test scores on the CTMM are classified by 
the latter as superior in intelligence; they are 
also well above average in both reading and 
arithmetic. Had one evaluated these children 
by their mean problem-solving scores on the 
Games, however, they would have been de- 
scribed as slightly below average. 

Both groups of children, nevertheless, have 
achievement scores that show a definite re- 
gression toward the mean from their CTMM 
predicted levels. By the latter, the slower 
group is overachieving and the brighter group 
underachieving. The former group is nearly 
a sigma below the mean in Total CTMM 
IQ, yet only about one-half sigma below in 
achievement. The latter group is nearly a 
sigma and a half above the mean in CTMM 
IQ, yet not a full sigma above in measured 
achievement. 

Parental occupational levels follow to some 
extent the expected trend in that a higher 
proportion of upper-class children did better 
on the CTMM than on the Games, whereas 
a higher proportion of lower-class children 
did better on the Games than on the CTMM. 


Summary and Conclusions 


A group of 184 fourth-grade children, rea- 
sonably representative of fourth-graders in 
terms of age and measured intelligence, were 
given a battery of tests including the Davis- 
Eells Games, the California Test of Mental 
Maturity, and the California Tests of Read- 
ing and Arithmetic. Interrelationships among 
the tests were investigated, with special refer- 
ence to the performance of bilingual children 
and those from varying social-class levels. 
While the general validity of the Davis-Eells
Games cannot be inferred from the data obtained
because of the test authors’ insistence that
academic performance is not an acceptable
validation criterion, it may still be assumed
from the relatively low correlations obtained
that the Games should not be used in place of
standard group intelligence tests for predicting
school success, grouping pupils for instructional
purposes, and similar traditional uses.

Comparison of the scores derived from the 
Games with those from the CTMM for 
bilingual children and children at various 
social-class levels suggested, though not con- 
clusively, that children with a bilingual back- 
ground and/or lower-class membership might 
do better on the Games than the CTMM. 
Trends were so slight that they need verifica- 
tion through other studies. Because of an 
atypical distribution of parental occupations, 
it was deemed inadvisable to compute cor- 
relations between socioeconomic status and 
aptitude. Havighurst (2) reports coefficients 
of .37 and .34 between the Games and socio- 
economic status for fourth-grade girls and 
boys, respectively, from a midwestern com- 
munity of approximately the same size as the 
one from which these data were obtained. The 
findings in combination suggest that what- 
ever gains lower-class children make on the 
Games as opposed to other intelligence tests 
would be slight. Use of the test in elemen- 
tary schools at present except as a research 
instrument does not appear to be warranted. 


Received August 17, 1955. 
A Procedure for Obtaining Self-Ratings
and Group Ratings¹


Wilse B. Webb²
U. S. Naval School of Aviation Medicine 


This is a report of the development of an 
effective technique for obtaining peer ratings 
and self-ratings of individual characteristics. 

The procedure grew from research con- 
cerned with relations between group evalua- 
tions, self-evaluations, and objective meas- 
ures (1, 3). In this research an odd obstacle 
developed. It was possible to obtain reliable 
group ratings by peer-nomination or rating- 
scale procedures for such characteristics as 
intelligence, leadership, personal charm, and 
the like. The difficulty lay in the self-ratings. 
Using either a self-ranking or a rating-scale 
procedure these ratings were quite unreliable. 
It was clear that until the reliabilities of the 
self-ratings could be increased the meaning- 
fulness of our interrelations was limited. 

As noted in a previous paper (3), manipulations
of reliability formulas indicated that the high
reliability of the group ratings accrued
primarily from the number of ratings obtained on
a given individual. Further, the self-ratings,
whether obtained by rating-scale or ranking
procedures, yielded low reliabilities because of
the “shortness” or “single item” character of
the rating.
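
One formula that behaves in this way is the Spearman-Brown prophecy formula, which projects the reliability of a composite of k comparable ratings from the reliability of a single rating. The paper does not name the formula it manipulated, so its use here is an assumption, and the input value is hypothetical.

```python
def spearman_brown(single_reliability, k):
    """Projected reliability of the sum of k comparable ratings."""
    r = single_reliability
    return k * r / (1 + (k - 1) * r)

# Hypothetical single-rating reliability of .20.
single = 0.20
for k in (1, 5, 25):
    print(k, round(spearman_brown(single, k), 2))
```

With many raters per man, as in the group ratings, even a weak single rating yields a dependable composite; a lone self-rating has k = 1 and gains nothing.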

There were two alternatives available for
increasing the reliability of the self-ratings.
One alternative is to turn to self-inventories in
which a large number of related but different
self-statements are made which are likely to be
related to a given trait. For example, we could
use the Guilford-Martin inventory to measure
friendliness. On the other hand, we may attempt
to obtain a large number of self-statements on
the same question. For example, “How friendly
are you?” could be asked a number of times. This
procedure pursues the latter alternative.

¹ Opinions and conclusions contained in this
report are those of the author. They are not to
be construed as necessarily reflecting the view
or the endorsement of the Navy Department.

² My thanks are due to and freely given to
Edward J. Wallon for his help in data collecting
and calculations.

For convenience’s sake, the procedure is 
titled the “SPM Procedure.” It will become 
clear in discussing the procedure that these 
initials stand for “self-plus-minus.” 


Procedure 


To increase the number of statements that the man
made about himself, a method of paired comparison
was introduced. Very simply, the individual was
given a roster of the individuals within his
group. He was asked to go through this roster and
compare himself with every other man in the
group. If he considered himself superior on a
particular trait to a given man, he assigned a
plus by that name. If in contrast he considered
himself inferior on a given trait, he assigned a
minus to that name. His self-rating rank within
the group then could be given by simply counting
the number of pluses which he assigned to the
members of his group and subtracting from the
total N. If, for example, he had given 15 pluses
in a group of 24 men, he considered himself to be
superior to 15 men in that group and inferior to
the remaining 8, and his rank on that particular
trait from his point of view was 9 in the group
(24 - 15 = 9).
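
The self-rank arithmetic can be written out as a small function. The plus/minus marks are represented here as a string, which is only an illustrative encoding and not part of the published procedure.

```python
def self_rank(marks, group_size):
    """Self-rank = group size minus the number of pluses given.

    `marks` holds one '+' or '-' for every other man on the roster.
    """
    if len(marks) != group_size - 1:
        raise ValueError("expected one mark per other group member")
    return group_size - marks.count("+")

# The worked example: 15 pluses and 8 minuses in a group of 24.
example_marks = "+" * 15 + "-" * 8
print(self_rank(example_marks, 24))  # 9
```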

In working with this SPM procedure, an additional
benefit was noted. By summing the total number of
pluses assigned an individual by all members of
the group (the number of men who consider
themselves superior to an individual) and by
adding 1, a ranking of the group judgment of this
man could be obtained. An example is given in
Table 1. Let us assume that there are five men in
a given group. The columns of Table 1 represent
the pluses and minuses assigned by each
individual to the other members of the group on a
given trait. The score at the end of each column
(5 minus the sum of the pluses) indicates the
self-rank which the man assigns himself. When the
table is constructed in this manner, it is quite
apparent that the rows can represent the judgment
of a given individual by his entire group and
that, by summing across the rows, the relative
ranks of individuals as judged by their fellow
members in the group on a given trait may be
obtained. If, for example, as in the case of man
A, 3 members of the group considered him superior
to themselves, he is clearly a superior man in
terms of group judgments on this trait. If in
contrast (as in the case of man C) all members
considered themselves superior to him, he is then
considered an inferior member in this particular
group. In the case of man E, he considers himself
superior to all members of the group whereas only
one member of the group rates him as superior.

Table 1

A Model of the SPM Rating Procedure for Five Cases

                                      Group
              A    B    C    D    E   ranks
A                  -    -    -    +     2
B             +         -    +    +     4
C             +    +         +    +     5
D             +    -    -         +     3
E             +    +    -    +          4
Self-ranks    1    3    5    2    1

+ = column designee considers himself superior to
row designee.

- = column designee considers himself inferior to
row designee.

Self-ranks = 5 - pluses of each column.

Group ranks = 1 + pluses of each row.
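
The tabulation behind Table 1 can be sketched in a few lines. The matrix follows the five-man model, with rows for the men being judged and columns for the men judging; the data layout and function names are illustrative assumptions rather than the author's scoring routine.

```python
MEN = ["A", "B", "C", "D", "E"]

# ratings[row][col] is '+' when the column man considers himself
# superior to the row man, '-' when inferior, None on the diagonal.
ratings = [
    [None, "-", "-", "-", "+"],   # judgments of A
    ["+", None, "-", "+", "+"],   # judgments of B
    ["+", "+", None, "+", "+"],   # judgments of C
    ["+", "-", "-", None, "+"],   # judgments of D
    ["+", "+", "-", "+", None],   # judgments of E
]

def self_ranks(matrix):
    """N minus the pluses each man (column) hands out."""
    n = len(matrix)
    return [n - sum(row[col] == "+" for row in matrix) for col in range(n)]

def group_ranks(matrix):
    """1 plus the pluses each man (row) receives."""
    return [1 + sum(mark == "+" for mark in row) for row in matrix]

print(dict(zip(MEN, self_ranks(ratings))))
print(dict(zip(MEN, group_ranks(ratings))))
```

Man E illustrates the contrast the procedure is meant to expose: his column gives him a self-rank of 1, while his row gives him a group rank of 4.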


Results 


Eight sections have been tested using the 
SPM procedure in the fourth week and fif- 
teenth week of preflight training of the Naval 
Air Training Program. These sections ranged 
in number from 21 to 34 members. The char- 
acteristics of intelligence and leadership were 
used. Estimates of the reliability within a
given session and between sessions for the 
two characteristics were obtained for both 
the group ratings and the self-ratings. These 
data are summarized in Table 2. 

The between-session reliabilities of the rat- 
ings given in Table 2 are the averages of the 
product-moment correlations for each of the 
eight sections between the fourth- and fif- 
teenth-week administration of the forms. The 
within-session reliabilities are the averaged 
reliability estimates within the fourth- and 
fifteenth-week administrations obtained for 
each section. These estimates were made by 
an analysis-of-variance procedure (comparing 
the row and column variances with the re- 
maining error variance). The average num- 
ber for each section in the fourth week was 
28.7 and for the fifteenth week the average 
N was 25.9. 
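
The between-session figure is an average of per-section product-moment correlations. A minimal sketch of that computation follows; the rank data are invented for illustration and the helper names are not from the paper. The analysis-of-variance estimate used within sessions is not reproduced here, since the paper does not give its formula.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_reliability(sections):
    """Mean of the per-section correlations between two sessions."""
    rs = [pearson_r(week4, week15) for week4, week15 in sections]
    return sum(rs) / len(rs)

# Hypothetical ranks for two small sections (4th week, 15th week).
sections = [
    ([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]),
    ([1, 2, 3, 4], [1, 3, 2, 4]),
]
print(round(average_reliability(sections), 2))
```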

Table 2

Reliabilities Within and Between Sessions for SPM
Group Ratings and Self-Ratings on Intelligence
and Leadership

            Intelli-   Intelli-   Leader-   Leader-
            gence      gence      ship      ship
            group      self-      group     self-
Week        rating     rating     rating    rating
4th          .90        .91        .92       .90
15th         .92        .90        .92       .88
4th-15th     .76        .58        .75       .69

These reliability estimates may be compared with
the reliabilities of group ratings obtained by
peer-nomination procedures and rating-scale
procedures from data on comparable populations.
The widely used peer-nomination technique has
been applied to a number of cadet populations.
The group members were asked to write the names
of the three “highest” in their section on
leadership potential and also the three “lowest”
on leadership potential. Group measures were
developed by assigning a plus when a man was
nominated “high” and a minus when a man was
nominated “low” and algebraically summing the
pluses and minuses for each individual within the
group. An average within-session reliability of
this technique for six sections of cadets has
been reported as .88 (2). The total number of
cases in the study was 116. The reliability
between the fourth- and fifteenth-week sessions,
estimated on three sections of cadets, was .78
for a total of 68 cases (unpublished data). In
both of these instances the trait of leadership
was considered. Group ratings have also been
obtained by a 7-point rating-scale procedure in
which each man rated every other man in his
section and himself on intelligence. This
procedure yielded within-session reliabilities of
.86 and .89 at the fourth week and fifteenth
week, respectively, and a between-session
reliability of .76 (3).
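
The peer-nomination scoring described above amounts to an algebraic tally. A sketch under illustrative assumptions follows; the ballot format and the names are hypothetical.

```python
from collections import Counter

def peer_nomination_scores(ballots):
    """Algebraic sum of 'high' (+1) and 'low' (-1) nominations per man.

    Each ballot is a pair: (names nominated high, names nominated low).
    """
    scores = Counter()
    for high, low in ballots:
        for name in high:
            scores[name] += 1
        for name in low:
            scores[name] -= 1
    return dict(scores)

# Hypothetical ballots; the procedure asks for three names at each end.
ballots = [
    (["Adams", "Baker", "Clark"], ["Davis", "Evans", "Ford"]),
    (["Adams", "Clark", "Davis"], ["Baker", "Evans", "Ford"]),
]
scores = peer_nomination_scores(ballots)
print(scores)
```

Men who are never nominated receive no score under this tally, whereas the SPM procedure obtains a judgment of every man from every rater.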

The reliability of two methods of obtaining
self-ratings may also be compared with the SPM
procedure. A between-session reliability of
leadership rankings (each subject was asked to
assign himself a rank number within the total
number of people in his section) was obtained for
three sections of cadets. The average reliability
between the fourth- and fifteenth-week
administrations for the three sections of cadets
(total N of 68 cases) was .34 (unpublished data).
The rating-scale procedure (each individual
rating himself on a 7-point scale) yielded a
between-session reliability of .19 for 95 cases
(3).

The reliability findings may be summarized. 
The group-rating reliabilities both within and 
between sessions for three methods are quite 
comparable. The SPM procedure enjoys a 
considerable advantage in the self-rating re- 
liabilities between sessions. 

The comparability of the SPM procedure 
for group ratings and the peer-nomination 
procedure was determined. The two alterna- 
tive procedures were used in the fifteenth- 
week sections in which SPM data of Table 2 
were obtained. The averaged correlation be- 
tween the group ratings obtained by the two 
procedures was .80. In addition to the peer- 
nominations the rank-order procedure for ob- 
taining self-ratings was also introduced at the 
fifteenth week for the eight sections. These 
rankings correlated .34 with the SPM self- 
rating ranks. 

The relationships between the measures obtained
by the SPM procedure and objective measures can
be roughly compared with those relationships
obtained by the rating procedure on the variable
of intelligence. This comparison, however, is
only approximate in that the objective measure
for the SPM method is the ACE and in the case of
the rating-scale study, this measure was the Otis
self-administering test (3). The data are given
in Table 3.

Table 3

Correlations Between SPM Method and the ACE and
Rating-Scale Method and Otis for Group
Rating and Self-Rating

                 Group     Self-
                 rating    rating
SPM               .46       .26
Rating Scale      .49       .21


Summary 


Summary statements of the available sta- 
tistical data concerning the SPM procedure 
and its comparison to other methods can be 
given: 

1. The SPM procedure yields highly reli- 
able measures within sessions for both group 
ratings and self-ratings. 

2. The between-session reliabilities of the 
group ratings are high and the between-ses- 
sion self-ratings are respectable. 

3. The SPM procedure increases the reliability of
between-session self-ratings when compared with
self-ratings given by rank designation procedure
or a self-rating on a rating scale.

4. The group ratings obtained by the SPM method
are considerably correlated with the
peer-nomination procedure for group ratings.

5. The self-ratings of the SPM procedure within
sessions show only a limited relation with a
ranking procedure. Comparing the between-session
reliability would favor the use of the SPM
method.

6. The group ratings show substantial and similar
relations to objective measures when compared
with those obtained by a rating procedure.

7. Self-ratings show low and similar relations to
objective measures when compared with those
obtained by a rating procedure.

In addition to these statistical facts, a number
of “fringe benefits” may be cited for the
procedure outlined in this paper:

1. Measures of intrasession reliability are
available for the self-ratings.

2. Group ratings are obtained indirectly
without the man being forced to name high 
or low extremes of his group or to rate indi- 
viduals extremely. 

3. The procedure is extremely simple to 
administer and requires a shorter time than 
the other procedures. It can be easily ad- 
ministered on mark-sensing cards or on IBM 
answer sheets. We use the latter procedure in
which each name of the class roster is as- 
signed an answer space and the rater marks 
answer space “a” if he considers himself su- 
perior for a given item (name-number) or 
“b” if he considers himself inferior. 

4. There is a common set for all subjects 
in the case of the self-ratings. For example, 
the other procedures permit the individual to 
rate as he thinks the others will rate him or 
as he thinks he should be rated. 


5. The value of a trait to a group may be 
obtained indirectly by counting the total 
number of pluses assigned a trait by the 
group. Indirect measures of morale may be 
obtained through such a plus-minus counting 
for the total group. 


Received August 24, 1955. 
New Tests


Bennett, George K., & Gelink, Marjorie. Short Em- 
ployment Tests (SET), 1956 Manual. New York: 
Psychological Corp., 1956. Pp. 11. 

The second revision of the manual for the Short 
Employment Tests (see J. consult. Psychol., 1952, 
16, 159; 1953, 17, 401) contains correlations of the 
tests with job criteria in 19 business situations. The 
norms for job applicants are now based on more 
than 33,000 women and 9,000 men.—L. F. S. 


Brown, William F., & Holtzman, Wayne H. Brown- 
Holtzman Survey of Study Habits and Attitudes 
(SSHA), 1956 Manual. New York: Psychological 
Corp., 1956. Pp. 11. 


The revised manual contains technical data for 
high school samples as well as for college students 
(see J. consult. Psychol., 1954, 18, 153-154). At the 
high school as well as at the college level, the SSHA 
shows substantial correlation with achievement, has 
low correlation with the ACE, and adds significantly 
to the multiple correlation. Percentile norms are 
given for college and high school groups, derived 
from 3,500 and 2,800 cases, respectively.—L. F. S.


Buhler, Charlotte, & Manson, Morse P. The Picture 
World Test. Individual projective technique. Child- 
adolescent-adult. 1 form. Untimed, (30) min. Set 
of materials ($18.50 for 25 administrations), with 
manual, pp. 86; manual only ($3.50). West Los 
Angeles, Calif.: Western Psychological Services,
1956. 

The Picture World Test is essentially a picture 
version of the well-known World Test, designed to 
explore how a person perceives and structures his 
world, with emphasis on his goal-setting activities. 
The examinee is presented with 12 small drawings of 
structured scenes, a large sheet of blank paper, and 
a chart of 36 simple symbols by which he can add 
persons, vehicles, buildings, and animals to the scenes. 
He is asked to choose any number of the scenes he 
wishes “to make up a world as it is or as you would 
like it to be; the world you like or dislike; the 
world of your dreams. . . .” The examinee gums the
scenes to the blank sheet, connects the scenes as he
wishes, adds symbols, gives his world a name or 
title, and tells a story about it. 

The manual describes methods for interpreting the 
world scenes and world stories. Some data are given 
on the frequencies of the main story categories in
relation to age and quality of adjustment.
Twenty-eight illustrative case studies demonstrate
the clinical use and interpretation of the test.

All in all, the Picture World Test seems to be 
an interesting and challenging instrument, likely to 
evoke self-revealing responses of considerable
clinical value.—L. F. S.


Cureton, Edward E., Cureton, Louise W., et al. The 
Multi-Aptitude Test. For demonstration in classes 
and lay groups. 2 forms. 35 (50) min. Study kit, 
with 2 test forms, keys, and manual, pp. 32 ($1.25 
ea., $9.00 per 10); test booklet separately, form A 
or B ($3.50 per 25). New York: Psychological 
Corp., 1955. 

The Cureton Multi-Aptitude Test is unique. Not 
intended for any practical purpose of measurement, 
it is designed to teach and demonstrate the nature 
of psychological tests to a wide range of interested 
persons. Advanced classes may use the test data for 
item analyses and studies of test intercorrelations; 
nonprofessional groups may satisfy their curiosity 
about what tests are. The 10 short subtests represent 
most of the common varieties of test item. Five 
parts—vocabulary, information, arithmetic, number 
series, and figure classification—are typical of group 
mental tests and also illustrate verbal, number, and 
reasoning factors. The remaining subtests—mechani- 
cal comprehension, word recognition, scrambled let- 
ters, checking, and paper form board—typify other 
abilities found by factorial studies and used in spe- 
cial aptitude tests. The keys acquaint students with 
various types of scoring devices. The manual is an 
excellent miniature specimen for instruction, with 
full tables of data and norms for practice, although 
not for practical interpretation. The test should be 
widely useful as an unrestricted instrument for in- 
struction, thereby sparing other tests whose security 
needs protection.—L. F. S.


Phillipson, Herbert. The Object Relations Technique. 
Individual projective technique. 1 form. Adoles- 
cent-adult. Average time (90) min. Set of 13 
plates, with manual, pp. x + 224, cloth ($10.00); 
manual only ($6.00). Glencoe, Ill.: The Free Press, 
1955. 

The Object Relations Test is a new projective 
method which bears some resemblance to the TAT, 
developed at the Tavistock Clinic at London. Its 
novel features are its rationale which springs from
a theory of personality, and its stimulus pictures 
which were designed so as to vary certain impor- 
tant characteristics systematically. The test’s name 
and intent come from the psychoanalytic theory of 
unconscious object relations which exist within the 
personality, were derived from the individual’s re- 
lations with significant persons in his earlier life, 
and in turn determine his relationships with the 
real people in his present external world. 

Twelve of the pictures—the thirteenth is a blank 
card—are divided into three series of four plates 
each. The series vary in degree of structure, from 
vague, hazy figures and backgrounds in series A, to 
detailed, realistic backgrounds in series C. In all se- 
ries, the human figures are ambiguous as to sex and 
lack facial features. The four cards within each se- 
ries represent one figure, two figures, three figures, 
and a group. Administration is similar to that of 
the TAT. The cards are not shown in serial order, 
but in a designated order which mixes the series and 
the numbers. 

The accompanying book discusses the theory on 
which the test is based, describes the materials and 
their administration, and gives methods for the 
analysis and interpretation of the stories. The inter- 
pretation of one protocol is given at length, and six 
other cases are presented more briefly. Normative 
information is given for 50 clinic outpatients and 
for a sample of 40 normal adolescent girls. Although 
such data may seem scanty if judged by standards 
appropriate to mass testing, they exceed the mate- 
rial usually offered with a new projective method 
and are unquestionably useful. 

The pictures are sensitively conceived, and their 
systematic plan commends them both for clinical 
use and for research. This instrument deserves a 
thorough exploration by American psychologists and 
may well prove to be a major development in pro- 
jective methods.—L. F. S. 


Segel, David, & Raskin, Evelyn. Multiple Aptitude 
Tests. Grades 7-13. 1 form. 175 (210) min. 9 test 
booklets ($2.45 to $4.55 per 35, each; set, $24.50 
per 35), with manual, pp. 96, keys; 2 IBM an- 
swer sheets (4¢); extended or transparent profile 
sheets (2¢); sample set ($1.75). Los Angeles: Cali- 
fornia Test Bureau, 1955. 

The Multiple Aptitude Test offers another battery 
of differentiated tests for high school students and 
college freshmen. The nine subtests—word meaning, 
paragraph meaning, language usage, clerical facility, 
arithmetic reasoning, arithmetic computation, applied 
science and mechanics, spatial relations in two di- 
mensions, and spatial relations in three dimensions— 
may be combined to yield four factors, verbal, per- 
ceptual speed, numerical, and spatial. The 96-page 
manual gives unusually full data, although there are 
some conspicuous gaps. Reliabilities by
Kuder-Richardson formula 21 range, in single-grade groups,
from .72 to .92 for the subtests and from .91 to .95 
for the factors. Validity is discussed in terms of the 
factor analysis, correlations with other tests, and 
correlations with school marks in 18 subjects. The 
norms, obtained on a nationally distributed sample 
of over 11,000 cases in grades 7 through 13, are ex- 
pressed as standard scores and percentile ranks for 
each subtest. 

The data in the manual reveal a number of inter- 
esting strengths. Good attention is paid to the stand- 
ard errors of differences between scores, and users 
are cautioned against drawing conclusions from non- 
significant differences. An examinee’s test profile can 
be compared with those of high and low achievers 
in each school subject, and expectancy tables are 
provided which relate school marks to scores on the 
most predictive subtests. 

There are also some weaknesses. Most striking is 
the fact that all norms and interpretive data are 
given in terms of the nine subtest scores, while the 
more reliable combined factor scores are almost ig- 
nored. The manual gives little information about the 
selection of content and the development of the sub- 
tests. The data also disclose some perplexities. Why 
is the arithmetic computation subtest not only the 
best predictor of grades for general mathematics and 
general science, but also the best or close next-best 
for English, foreign languages, social studies, and 
psychology? Such an observation casts some doubt 
on the value of the tests for differential prediction 
in comparison to a single predictive score.—L. F. S.


Wagner, Mazie E., & Schubert, Herman J. P. D. A. 
P. Quality Scale for Late Adolescents and Young 
Adults. Buffalo 22, N. Y. (State University of 
New York College for Teachers): Authors, 1955. 
Pp. 23 + 28 plates. $2.00. 

A long-established way to evaluate a personal 
product globally is by the use of scaled specimens 
for comparison. Wagner and Schubert have produced 
four scales for rating drawings of a person—male 
and female figures, drawn as front and side views. 
Data in the accompanying monograph show that 
trained raters can use the scales with high reliability 
(> .90), that novices can learn to rate quickly, and 
that examinee retest reliability for drawings so rated 
is .85. Most interesting is the evidence that the draw- 
ing scores contribute significantly to the prediction of 
scholarship in a teachers’ college, and have low cor- 
relations with other predictors. The findings suggest 
that the DAP may be used to predict the achieve- 
ment of young adults, just as Goodenough used it 
to predict the achievement of young children many 
years ago. The scales also demonstrate the rewards 
which come from applying sound quantitative meth- 
ods to the over-all appraisal of a complex personal 
product.—L. F. S. 